Limitation on output rows in Transformation Flow
Answered
Hi,
I've tried to use a transformation flow to join data from different data sources.
I have a problem with the output table: there are only 400 rows in it.
I've tried several times with different input data containing more than 10,000 rows - with an input step, a transformation step and an output step, and also with only an input step and an output step - there are always only 400 rows in the view built on the output table.
Can I change this limitation?
Hi Timofey,
I've just done a quick test in my Yellowfin and I was successfully able to output a table of 26,000 rows.
There is a Row Limit set in the Yellowfin Data Source as follows:
so that would definitely be worth checking.
And just in case that isn't the reason for your 400 row output, could you please show a screenshot of the Transformation Summary such as the one below:
and also please zip up and send across all your log files.
thanks,
David
Hi Dave,
I'm glad that you answered.
The Max Row Limit was turned off.
This is a screenshot of the transformation flow edit window:
This is a screenshot of the Transformation Summary:
After that, I edited my Transformation Flow - turned on Enable Scheduling and ran a preview. It was the same 'no logs' in the Transformation Summary. Then I pressed the Run now button in schedule management and now it's OK! I see my 20k rows!
Hi Timofey,
that's great news, I'm glad it's showing the 20k rows now.
Sounds like maybe you were only doing the preview previously, not the full proper run.
Thanks for sharing the good news!
regards,
David
Hi Dave,
I've run into another problem:
I tried to merge data from two data sources. As you can see in the screenshot, there are 6 million rows in one data source and 20k in the other. When I tried to run this Transformation Flow for the first time, the output stage took 40 minutes and after that Yellowfin stopped responding and we had to restart it.
Later, when the Transformation Flow ran on schedule, its status was 'cancelled'.
Attached you can find our logs.
The question is: why does the output stage take so much time, and how can I speed it up?
Hi Timofey,
yes, inserting large numbers of records can be very slow; if you research this on the internet you will find lots of articles and forum posts on the subject, along with advice on how to improve the performance.
As to why the timeout is occurring, most probably it is the timeout on the data source you are inserting into, so it would be good to increase the Timeout value in the Yellowfin data source:
and if that doesn't help then it also might be the connection to the Yellowfin configuration database that is timing out, in which case you should increase the JDBCTimeout value as described in the following Knowledge Base article:
https://community.yellowfinbi.com/knowledge-base/article/how-to-increase-the-connection-timeout-to-the-yellowfin-database
regards,
David
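For illustration only, here is a minimal JDBC sketch of why batching matters when inserting millions of rows - this is a generic example, not Yellowfin's actual output step, and the connection details, table and columns are made up. Grouping rows into batches and turning off auto-commit means one round trip per batch instead of one per row, which is usually the biggest lever for bulk-insert speed.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsertSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details, for illustration only.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/target_db", "user", "password")) {

            // One transaction for the whole load instead of one per row.
            conn.setAutoCommit(false);

            String sql = "INSERT INTO merged_output (id, label) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (int i = 0; i < 6_000_000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row " + i);
                    ps.addBatch();
                    if (i % 10_000 == 0) {
                        ps.executeBatch();   // flush every 10,000 rows
                    }
                }
                ps.executeBatch();           // flush the remainder
            }
            conn.commit();
        }
    }
}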
Hello again Timofey,
actually I just looked through your log file and saw the following errors which indicate to me that ClickHouse is locking Yellowfin out, so most likely you've got to configure something in ClickHouse to guard against that occurring again.
and also I noticed an out-of-memory error:
which means you should increase your JVM's heap space as described in the following KB article: https://community.yellowfinbi.com/knowledge-base/article/what-is-jvm-max-memory-and-why-should-i-care
also I saw the following PostgreSQL error, so it looks like I was right when I previously said to increase the data source timeout and the JDBCTimeout (I think you are using PostgreSQL for both a data source and the Yellowfin database):
So, in summary, that's 4 things you should do: increase the JVM's memory, increase the YF data source timeout for PostgreSQL, increase the YF data source timeout for ClickHouse, and increase the YF timeout for the connection to the YF DB (JDBCTimeout). And if there are still issues after doing all of that then please send the latest log file.
regards,
David
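As a quick way to confirm that a heap increase has actually taken effect after editing the JVM startup options, a tiny generic Java check (not Yellowfin-specific) is:

public class HeapCheck {
    public static void main(String[] args) {
        // Prints the maximum heap the JVM is allowed to grow to,
        // i.e. the value set with -Xmx at startup.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %d MB%n", maxBytes / (1024 * 1024));
    }
}

Run it with the same JVM options as the Yellowfin instance to verify the new limit.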
Hi Timofey,
just checking how you are getting on with this matter, did you get a chance yet to try those 4 suggestions, and if so, how did it go?
regards,
David
Hi David,
I have tried those 4 suggestions, but unfortunately the Transformation Flow still takes too much time. So, for our goals, we've found another solution - we connect the PostgreSQL database to ClickHouse, do the comparison there, and use the standard Yellowfin view option.
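For anyone reading later, a sketch of that kind of workaround, assuming the ClickHouse JDBC driver and ClickHouse's postgresql() table function - the hosts, credentials and table names here are made up:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ClickHouseSideJoinSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical ClickHouse connection; the join runs inside ClickHouse,
        // so only the joined result travels back to the client (e.g. Yellowfin).
        try (Connection conn = DriverManager.getConnection(
                "jdbc:clickhouse://clickhouse-host:8123/default", "default", "");
             Statement st = conn.createStatement()) {

            String sql =
                "SELECT c.id, c.metric, p.label " +
                "FROM big_clickhouse_table AS c " +
                "JOIN postgresql('pg-host:5432', 'pg_db', 'small_table', 'pg_user', 'pg_pass') AS p " +
                "ON c.id = p.id";

            try (ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " " + rs.getString("label"));
                }
            }
        }
    }
}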
Hi Timofey,
well I'm sorry to hear that the Data Transformation took too much time for you, although congratulations on coming up with another solution - it sounds like a clever and different way around the problem and that's what good I.T. projects are all about!
regards,
David