Flatten or combine search results

JonT shared this question 10 months ago
Answered

Our database takes certain data points that in the UI are in one field and creates individual records so that when querying the database you will get 4 records for 1 core record. An example:

/afdb6411552676324c873a6b974a6ae3


This example is one event in the console. In the Event Operation field, there would be a multi-line field that would have these two lines and there could be 100 lines. Is it possible to write a query that flattens this data to effectively create one for all entries that are identical and then group the lines that are not? We would want it to look something like this:


/7011d45b33dd40f0225ebacd4862e59c

It could be separated by any delimiter, but I am not that great at SQL and would ask for direction on how to approach this, if it is possible. I know there is some functionality with GROUP BY, but I don't know if you can build an index for the results and do what I am asking.

Best Answer
photo

Hi Jon,

Thanks for your response. I think there may be a much simpler alternative, if you don't mind your Event Operation column still being in separate rows. It looks like your Event Date, Event Time, Event ID, and Event Host values repeat frequently, so there is a way you can make all these values appear once, but Event Operation will remain in separate rows: you can turn on Suppress Duplicates for each of those columns. It would look something like this:


/10db76be428ae8bfcf5e1faacb68a357

/a52b027320b0b6d1d2a24e3eb5a1e368

Except in your example your first 3 or 4 columns would just display the once, and it would be Event Operation that keeps going. I think this would be by far the easiest way to get close to your desired end result without actually manipulating the data, though of course I'm not sure how important it is to get each Event Operation value essentially grouped by Event ID on an individual row basis.

Unfortunately, aside from this option though, the solution would require some sort of advanced data manipulation that is quite outside the scope of Support. I will say I strongly suspect this to be possible though, assuming my understanding is correct that you're looking to combine all Event Operation values per Event ID into one row. If you are interested in discussing Consulting options, you can reach out to your Account Manager, or I can do so on your behalf, and we can get this looked at further for you. Either way, please let me know how goes and whether you have any further questions.

Regards,

Mike

Comments (5)

photo
2

Hi Jon,

Thanks for reaching out. It looks like your Event Operation field has separate rows for values that are OPERATOR_CLOSED and OWNERSHIP_TAKEN. If you're looking to instead combine those values, your best bet is likely combining them at the data level. This may be possible by writing SQL queries directly against your database, but may also be achieved via ETL, either via our own Data Transformation tool, or a third-party one.

That said, it may technically possible to do this in Yellowfin as well via Group Data functionality, as you've alluded to. Rather, at a broad level, it will work using a simple case such as this but without really knowing your data and business use case, I couldn't speak to the viability of the solution with 100% certainty. Data-specific report building questions such as these typically fall more in the realm of Consulting as they can better spend the time to get to know your desired result and the data itself, but I suspect the following may still work for your use case.

That said, using your example above, here's a situation to mirror your setup:

/eaaf2ec2ff93eed41892fe8d389fe10a

Based on your example, I'd want Gender to instead show up as Female Male, so I'll choose Group Data:

/eab53d98a252d4fab90d5c5ea5392ae6

Then move both values into the field list:

/ee58b1c439bee61c64cd45763345c1e4

Click Save > Submit, and this is the result:

/d97f166d3942452ef637fae5cb677a9d

This is the resulting report SQL query (this may help with debugging.. the WHERE clause is just my filtering so I could limit the data set to one result for each of the first 3 columns):

/b162ae2ff4ebc63f7386060c15c454a6

Please let me know if you have any follow-up questions and whether I can provide any additional pointers.

Regards,

Mike

photo
1

Hi Mike,

Thanks for the information! Yes, I wanted to keep the question simple, but the basics of what I asked are what you seemed to have shown an option for us to try. I will work on this tomorrow and see if I can get it working based on your help.


Thanks again, I'll provide an update tomorrow or Friday!

photo
1

Hi Jon,

Thanks for getting back to me. Sounds good! Please let me know how goes.

Regards,

Mike

photo
1

Hey Mike,

I was trying what you mentioned and think the problem is the number of variables that we will see is going to be too great and dynamic for this to work as I was hoping. The information I work with is somewhat sensitive, so I have cleaned it up a little, but this is a better example of what I was asking.


/b0ca13c76e5aca64ca69917bb3cf0638


In this example, 2 EVENT IDs relate to the multiple entries for EVENT OPERATION. As a larger example, the count could be a total of 10,000 records where multiple values for EVENT OPERATION are related to the same value in EVENT ID. Because we will not know what values will show in any one column, the query would need to be able to take dynamic values for the EVENT ID and then group the values for EVENT OPERATION. I think that is where this becomes more complicated. The grouping is really dependent on the repetition of the value for EVENT ID, where EVENT OPERATION is not a static set of results.


I understand if it's not possible, I just could not wrap my head around how to write a query that was dynamic and capture what we are wanting to see. Thank you again for taking time to look at this, I really appreciate the information from this forum.

photo
1

Hi Jon,

Thanks for your response. I think there may be a much simpler alternative, if you don't mind your Event Operation column still being in separate rows. It looks like your Event Date, Event Time, Event ID, and Event Host values repeat frequently, so there is a way you can make all these values appear once, but Event Operation will remain in separate rows: you can turn on Suppress Duplicates for each of those columns. It would look something like this:


/10db76be428ae8bfcf5e1faacb68a357

/a52b027320b0b6d1d2a24e3eb5a1e368

Except in your example your first 3 or 4 columns would just display the once, and it would be Event Operation that keeps going. I think this would be by far the easiest way to get close to your desired end result without actually manipulating the data, though of course I'm not sure how important it is to get each Event Operation value essentially grouped by Event ID on an individual row basis.

Unfortunately, aside from this option though, the solution would require some sort of advanced data manipulation that is quite outside the scope of Support. I will say I strongly suspect this to be possible though, assuming my understanding is correct that you're looking to combine all Event Operation values per Event ID into one row. If you are interested in discussing Consulting options, you can reach out to your Account Manager, or I can do so on your behalf, and we can get this looked at further for you. Either way, please let me know how goes and whether you have any further questions.

Regards,

Mike

photo
1

Mike, thanks again for your assistance. We decided to use the Suppress Duplicates option and simply did that for a series of the attributes. Overall it gives the perspective we need, which is show event actions grouped together in a more aesthetically appealing way.

Thank you again for your ideas!

photo
1

Hi Jon,

You're most welcome. Glad I could help here! This considered, I'll go ahead and close this case out, but please don't hesitate to reach out with any additional questions or concerns.

Regards,

Mike

photo