Queries to help reduce a huge Yellowfin DB by only keeping most recent 5 versions of reports

A client had a Yellowfin database that was huge and they wanted to reduce the size of the larger tables by only keeping the 5 most recent versions of reports.

However, because their DB was so huge, the normal queries we give to reduce the ReportInstance and DocumentData tables were taking too long to run, so we came up with the following solution (it's for SQL Server, but the key features of it, i.e. using a TEMP TABLE and also using the ROW_NUMBER feature, are available in the other well-known databases )

This 1st query sets up all likely candidate DocumentIds into a TempTable:

SELECT ri.DocumentId, ri.ReportId, ri.ReportGMTDateTime into ##TempTable 
FROM ContentManagement cm, ReportHeader rh, ReportInstance ri
WHERE cm.ContentEntityCode = 'REPORT'
AND cm.ContentCode = rh.SubCategoryCode
AND rh.ReportId = ri.ReportId
AND ri.DocumentId IS NOT NULL
AND ri.DocumentId != 0
AND ri.RequestTypeCode != 'COMMENT'

and then the next query selects only those that are not the most 5 recent versions

WITH TOPFIVE AS (
    SELECT *, ROW_NUMBER() 
    over (
        PARTITION BY [ReportId] 
        order by [ReportGMTDateTime]
    ) AS RowNo 
    FROM ##TempTable
)
SELECT * FROM TOPFIVE WHERE RowNo > 5

So obviously once you've got your list of DocumentId values from the above query, then it's up to you to join them to other tables that you would like to clean up.

Is article helpful?