Joining on Fields From 3 Tables

Johnathon Warren shared this question 2 years ago
Answered

I have 3 tables in my database. Lets call them tables A, B, and C.


Table A has 4 fields: A1, A2, A3, and A4. A1 and A2 are tinyints, org reference codes. A3 is a number. And A4 is a date.


Table B has 4 fields: B1, B2, B3, and B4. B1 and B2 are tinyints, org reference codes. And B3 and B4 are dates.


Finally, Table C has 4 fields: C1, C2, C3, and C4. C1 and C2 are tinyints, org reference codes. C3 is a date. And C4 is a number.


In my batabase, these three tables are joined together in the following way:


  1. SELECT (A3 * C4) as Result
  2. FROM A
  3. INNER JOIN B
  4. ON A1 = B1 AND A4 >= B3 and A4 <= B4
  5. INNER JOIN C
  6. ON A2 = C1 AND B2 = C2


Explanation: Table A contains my primary sales data, which is joined onto various other tables for various rules and other information. Table B contains some locational data, joined from the primary table based on the location ID (A1 and B1) and within a certain date range (A4 within B3 and B4). The locational data within B includes a category ID (B2). The location's category (B2), and the Primary's category (A2) are both used to join table C to retrieve a value needed to multiply the sales number (A3 * C4).


My current view has Table A joining a large number of tables to retrieve whatever information the report requires. Table A also contains a lot of data that isn't normally needed or joined on. So I would strongly prefer not to write a freehand SQL that performs the same function, as this would slow down any reports that use this view.


As near as I can tell, Yellowfin has no way of allowing me to join table C using "A2 = C1 and B2 = C2".


The best alternative solution I can think of is to write a freehand virtual table that performs the join between tables B and C. And then have table A join it appropriately. However, I believe this isn't a particularly efficient solution and side-steps Yellowfin's view optimizer. It could also become particularly problematic as B and C grow in size, joining a ridiculous number of unneeded rows or combinations of categories, dates, and locations that may never exist within A, slowing down queries and reports.

Comments (8)

photo
1

Hi Johnathon,


Thank you for getting in touch. As you have noticed, Yellowfin does not allow circular joins, primarily for performance reasons. While there are other options at the view level such creating second copies of tables, I would recommend approaching this by using only the most necessary joins at the view level, aiming for simplicity, and refining the data set at the report level.


Depending on your data, this can either be done through sub-queries or filtering. Unfortunately, I am still having a hard time visualizing your use case with the data I generated from your description. If you can provide me with sample spreadsheets and a screenshot of your final goal at the report level, I can look into making more case specific recommendations.


I look forward to your response.


Regards,

Nathan

photo
1

My apologies. I had hoped my explanation would be enough. The data and relationships are a little hard to understand... The exact data I am using is considered confidential so I am unable to share it. Instead I have attached an excel file that should approximate what I am trying to achieve.


Changing the data and tables isn't out of the question. The system I'm reporting on is in the process of development. So if you have a recommendation on how to re-shape the data, I would accept that as well.


Looking at the excel now... The best solution would appear to be to add the C4 column to Table A. My concern with this is that any updates to Table C will also need to be made to Table A... Not the most elegant solution, but perhaps the best choice right now?

photo
1

At the report level, the "Sales Result" field is a calculated field. My client requires it to be this way so that they can calculate the modified sales value on any given types of groupings. The calculated field is "SUM(A4 * C4)", so that when used by the client it will total the modified value. Very useful, considering what it actual represents for them. Performing this calculation on large lumps of sales data based on anything from date, to category, to state... This is what Yellowfin excels at!


However, it was recently revealed to me that locations also come in one of 3 tiers, and that it affects the value and multiplier. That the locational tiers can change based on date, just like multiplier can.


The only way I can imagine this working without exponentially increasing query times or information redundancy is with the join listed in my original post. Unfortunately, Yellowfin doesn't appear to have a way of performing the required join...

photo
1

I tried joining on a virtual table...

  1. SELECT B1, B3, B4, C1, C4
  2. FROM B
  3. INNER JOIN C
  4. ON B2 = C2

Which works... For the most part... Until I tried to join A from the Yellowfin View... And it turns out that B3 and B4 are recognised as varchars, even though they are dates internally. And Yellowfin won't allow me to make the join, as it doesn't allow a date to be compared to a non-date.


After a quick google, I found this thread: https://www.yellowfinbi.com/YFForum-Issue-with-creating-Date-field-in-virtual-table-?thread=170989


Now I'm just stumped... I don't know what to do...

photo
1

Hi Johnathon,


Thank you for providing the spreadsheet and I apologize for the delay in response. It is currently Thanksgiving holiday for us in the United States office.


From my initial inspection, I would recommend joining all of your tables within the view as you have described, except for the final b2<->c2 join that creates the circular loop. This will return a result with many invalid rows, where b2!=c2, but these can be removed at the report level.


One idea for how to remove this would be to create a calculated field at the view level that is essentially:

CASE WHEN B2=C2 THEN 1 ELSE 0

Then add a filter to the report that omits all rows where the calculated field is 0.


Please let me know if a report level solution does not meet your needs. Unfortunately, I will not be able to fully recreate and test this issue today. I apologize again, and will try to get a more comprehensive answer to you as soon as possible.


Regards,

Nathan

photo
1

I was afraid that something like that would be the only solution... My concern with that is that the users will have to add that filter to any report they make, and there's a really good chance that they will just mess it up and forget to add it. Which will inflate the other numbers without them ever realising they made a mistake.


At this stage, I think the best solution may be to add the other tiers' values to table C. Remove C2, and instead of C4, it will be C4i, C4ii, C4iii, C4iv etc. Alter the YF view, change the multiplier to a calculated field, something like


  1. CASE WHEN B2 = 1 THEN C4i WHEN B2 = 2 THEN C4ii WHEN B2 = 3 THEN C4iii WHEN B2 = 4 THEN C4iv END


Hopefully they don't plan on adding too many more tiers, but it should mostly be a simple matter of expanding the case statement and modifying the table....


I will have to do tests and see how the performance between the two solutions compare.


Thanks for your suggestion, and I hope you enjoy your holiday!

photo
1

Hi Johnathon,

Thank you for your patience, I have finally had an opportunity to further investigate this issue.

Contrary to my previous message, Yellowfin does not allow circular joins due to a patenting issue, which should be expiring in the near future.

As you are not keen on report level solutions, I would suggest instead approaching this from the level of the underlying data. Within your database, you can create an additional table 'bc' from a join of 'b' and 'c' on b2=c2.


  1. CREATE TABLE public."BC" AS
  2. SELECT * FROM public."B"
  3. INNER JOIN public."C"
  4. ON public."B"."Location Tier (B2)"= public."C"."Location Tier (C2)"

Within the Yellowfin view, you can then use 'BC' instead of 'B' and 'C' separately, joining 'A' to each of the necessary fields. Below are screenshots of my joins at the view level, as well as a preivew of the result, once the final calculated field has been made (this is previewed using 7.3's updated view builder)

cc7274598a0937523371371e63b5d072

fb8ae678b6aa0564b95b56f3165f5989

Please let me know if this solution works for you, or if you have any additional questions relating to this issue.

Regards,

Nathan

photo
1

HI Johnathon,


I was just wondering if you have had any luck in resolving this issue? Please let me know if you have any additional questions, or if you would be okay with me closing this ticket.


Regards,


Nathan