Too many reads to sys.schemas

Sofia Rabin shared this idea 8 months ago
Idea Logged

Hi,

We are seeing too many reads in our DB from the following query:

<?query --

select sys.schemas.name 'TABLE_SCHEM',
       CASE WHEN sys.schemas.name IN ('dbo', 'guest', 'INFORMATION_SCHEMA', 'sys',
                                      'db_owner', 'db_accessadmin', 'db_securityadmin',
                                      'db_ddladmin', 'db_backupoperator', 'db_datareader',
                                      'db_datawriter', 'db_denydatareader', 'db_denydatawriter')
            THEN null ELSE DB_NAME() END 'TABLE_CATALOG'
from sys.schemas
order by 2, 1

--?>
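(Editorial aside: to unpack what the query computes, its CASE expression can be mirrored outside the database. Below is a minimal Python sketch of that logic; the schema and database names in the example calls are illustrative only, not taken from Sofia's system.)

```python
# Mirror of the query's CASE expression: built-in SQL Server schemas and
# database roles get a NULL TABLE_CATALOG, user schemas get the DB name.
SYSTEM_SCHEMAS = {
    'dbo', 'guest', 'INFORMATION_SCHEMA', 'sys', 'db_owner',
    'db_accessadmin', 'db_securityadmin', 'db_ddladmin',
    'db_backupoperator', 'db_datareader', 'db_datawriter',
    'db_denydatareader', 'db_denydatawriter',
}

def table_catalog(schema_name: str, db_name: str):
    """CASE WHEN name IN (...) THEN NULL ELSE DB_NAME() END."""
    return None if schema_name in SYSTEM_SCHEMAS else db_name

print(table_catalog('dbo', 'MyDB'))    # prints: None
print(table_catalog('sales', 'MyDB'))  # prints: MyDB
```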



1. What is the meaning of this query?

2. Does it affect YF performance?

3. Why is its status "sleeping"?

4. Why does it generate so many reads when sys.schemas contains only 16 rows?


Thanks,

Sofia

Comments (20)


Hi Sofia,


Apologies for the delayed reply. I have escalated this to one of our senior developers to review and I will come back to you with their answers. Our team here in Melbourne (where all of our developers reside) is in the process of moving offices at the moment, so I may not have an answer for you until early next week, but as soon as I get word I will let you know.


Cheers,

Neal


Hi Sofia,


Thank you for your patience on this. Having spoken with our developers, I must say it is difficult to provide a straight answer without some additional information from your system. Obviously Yellowfin is making some sort of call to fetch the list of tables in the database. This happens in a number of places, but we're not sure why it's happening this many times. This call could be something as simple as a new connection being added to the pool, but we will need some debug logs and InfoThreads to investigate further.


This article will show you how to turn on debug logging, and this one will get you ready to run InfoThreads.


With the above additional information we can start to piece this together for you.


Cheers,

Neal


Hi,

Attached are the debug logs.

I tried to run the InfoThread file from the link, but it disappears after one second.


Thanks,

Sofia


Hi Sofia,


Apologies for the delayed reply, we had a release for v9 last week so it's been a bit busy getting that together. Our developers have said they have found it difficult to pinpoint this without the info threads. I have attached the info thread tool to this ticket to assist.


Cheers,

Neal


Hi Sofia,


Hope you are well, just checking in to see if you were able to get the InfoThreads data we requested. How is Yellowfin performing for you? Are you seeing any specific issues we can look to address?


Cheers,

Neal


Hi,

We can't replicate it to get the InfoThreads because we don't understand what those reads are.

It causes very high CPU. I hope this information can give you some indication:

Attached is an Excel file with all the reads at the time of the high CPU.



Hi Sofia,


Thank you for the information on the performance impact you are experiencing. I do understand it is a bit of a Catch-22 situation: you cannot replicate the issue without understanding what causes it, and likewise we can't understand what causes it without the issue occurring so we can gather the InfoThreads data.


Unfortunately there is no direct information in the spreadsheet that indicates what processes Yellowfin is running at those times. As difficult as it may be, the next time the CPU spikes like that, please run the InfoThreads to gather a few snapshots of what is happening so we can dig further into this and hopefully find a resolution.


Looking at the timeframes in this image, let's try to narrow down any potential culprits. Is there anything significant about those timeframes? Does the high usage occur at roughly the same time each day, or weekly? Are there a lot of broadcasts or filter updates scheduled for this time? Is there an increase in user activity at this time? Is there an ETL process that runs then?


Cheers,

Neal


Hi,

Unfortunately I couldn't run the InfoThread tool while the server CPU was at 99%; the server was stuck, and when the InfoThread window opened after clicking 'GO' it just popped up and closed. I'll try again next time with lower CPU.

Meanwhile, I have attached the logs, 'Who is Active' output, and event table data; maybe they can help until I get the InfoThreads.


Thanks,

Sofia


Hi Sofia,


Thank you for that. I will pass this onto our developers to see if this gives them enough to try and get some more information on those calls for you.


I will probably need to come back to you early next week on this, but I will send through an update as soon as I can get an answer.


Cheers,

Neal


Hi Sofia,


Hope you had a good weekend. I am still waiting to get some time from the team to look into the queries, but I wanted to follow up on the following questions. Looking at the timeframes in this image, let's try to narrow down any potential culprits:

  • Is there anything significant about those timeframes, and does the high usage occur at roughly the same time each day?
  • Or weekly?
  • Are there a lot of broadcasts or filter updates scheduled for this time?
  • Is there an increase in user activity at this time?
  • Is there an ETL process that runs then?

Can you also zip up your logs folder (provided there is still data in there from the date/time shown in those images) and upload it to our FTP found at this link, with the filename 16980.zip, so I can take a deeper look there too?


Cheers,

Neal


Hi Neal,

It seems the high CPU was a combination of multiple users and big dashboards running. We discussed it in our weekly conversation with Richard and Nathan, and we have since updated the heavy dashboards per their recommendations.


We still need to understand what those sys.schemas reads are.

1. It doesn't happen at the same hours/days.

2. Most of the broadcasts are daily broadcasts, and on most days there is no CPU problem.

3. We have a lot of users, but according to the event table not that many users were logged in at the same time.

4. The ETL runs daily, every 3 hours; at that time there wasn't any blocking due to CPU.


The only thing I found is that at that time there were a lot of reads for every sys.schemas row.

For example, right now all the rows exist but the reads are lower:


logs folder uploaded as 16980.zip.


Thanks,

Sofia


Hi Sofia,


I am glad you were able to modify your dashboards to accommodate for the user numbers and reduce the CPU spikes.


As for understanding the sys.schemas reads, I will pass on the logs and follow up to see if they give our developers enough context.


And just to be clear, my questions about similar times, broadcasts, users and ETL were all to do with your performance issues with the CPU spiking, not the sys.schemas calls. With this in mind, can you please run the info thread runner at any stage and we will try to match that up as well.


Cheers,

Neal


Hi Sofia,


Thank you for your patience on this one. Finally getting further into this: this query is not SQL we run explicitly, but it is related to a JDBC call we are making, most likely the getSchemas() call made as part of JDBC Verify / Verify Source (which I can see is enabled in your logs). That setting tells Yellowfin to check all active connections with a background scheduled task that runs every minute. It checks each connection by returning the schema (this is the call you are seeing); if the schema is not returned, the connection is removed from the in-use list. This runs every minute or so per connection (that is not in use), per data source. So, if you have a data source with a maximum of 40 connections, all of which are active, then every minute that each of those connections is active (but not currently in use) this call will be triggered. The effect on performance really comes down to the volume of requests your server can handle.
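(Editorial aside: the arithmetic in that explanation can be sketched as a quick back-of-envelope estimate. The one-minute check interval and 40-connection pool are the assumptions stated above, not measured values from Sofia's system.)

```python
# Rough, hypothetical estimate of sys.schemas verification-query volume.
# Assumption (from the explanation above): each idle-but-active connection
# is verified roughly once per minute, per data source.

def verify_queries_per_hour(idle_connections: int,
                            data_sources: int = 1,
                            checks_per_minute: float = 1.0) -> float:
    """Verification queries issued per hour across all pools."""
    return idle_connections * data_sources * checks_per_minute * 60

# Example from the explanation: one source with 40 idle connections.
print(verify_queries_per_hour(40))  # prints: 2400.0
```

This is why even a 16-row table can rack up a large read count: the volume comes from the check frequency, not the table size.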


Removing Verify Sources should make these calls go away; however, depending on the original reason for enabling this setting, that may not be advisable.


Does this all make sense? Were there any other questions you had on this? How has your performance been since re-engineering your dashboards?


Cheers,

Neal


Hi Sofia,


In addition to the information provided, I have raised an enhancement request to simplify this query, to be considered for future versions of Yellowfin. This will be reviewed by our Product Team and chosen for enhancement based on feasibility and the level of positive impact on the user experience. This post will be updated with any future information relevant to this process.


Cheers,

Neal


Hi,

Until then, maybe I can create a job to kill all the sessions every hour? Can I do that? Will it affect user logins?


Thanks,

Sofia


Hi Sofia,


I don't think that would be a useful workaround as this has the potential to either log users out or lead to data corruption.


At this time there is no workaround for this, though as previously noted, whilst there are a lot of hits for this query, the impact on performance should be pretty minimal. If you are having performance issues, please open a new request with the specifics of the issues you are facing, along with the InfoThreads taken during a time when performance is poor, so we can look to address the root causes.


Cheers,

Neal


Hi Neal,

Thank you for the quick response.

What do you mean by 'lead to data corruption'?


Thanks,

Sofia


Hi Sofia,


Sorry, I should have been clearer. There are a couple of ways to take your original suggestion so I should address how I first understood it and then how else it could have been interpreted.


Initially I read this as running a process to close all DB connections every hour. Yellowfin has a mechanism for this, which is to set the timeout to 1 hour. But it would only kill an inactive connection 1 hour after it had been started, not at 1pm, then 2pm, then 3pm, etc. With this in mind, the only way to stop all sessions every hour, on the hour, would involve some kind of hard cutoff (which would need to be outside of Yellowfin); if a session were killed just as Yellowfin decided to use it, data from that connection could become corrupted.


If you were instead to simply reduce the Max Open Time from 3 hours to 1 hour, this could reduce the number of open connections. It may not make a major impact on the number of reads you are seeing, but unless you notice any ill effects, it cannot hurt to try.


Please let me know if you have any further questions.


Cheers,

Neal


Hi Neal,

Thank you for the explanation.

Let's try to reduce the Max Open Time. Where can I find it?


Thanks,

Sofia


Hi Sofia,


I am referring to the Refresh Time setting on the data source itself. Please see the image below; this example is set to 3 hours:

[screenshot: data source Refresh Time setting, set to 3 hours]

If you reduce this setting to 1 hour, Yellowfin will remove/refresh any unused connections after 1 hour, which should reduce the number of open connections waiting, and therefore the number of these sys.schemas calls. However, this will depend on the usage of the system and may have little to no impact on the overall performance of Yellowfin.


Please let me know if you have any further questions on this.


Cheers,

Neal