Data Transformation Truncation Issue

Edgar Kautzner shared this problem 4 months ago
Defect Logged

Hi

I'm troubleshooting an issue with a (fairly simple) data transformation and hoping to get some assistance.

There are two steps in the data flow:

- Extract from query (this extracts four columns from a DB)

- Load to db (this loads the output to a table)

When I run the transformation, the input step succeeds, however the output fails with the error java.lang.Throwable: Data truncation.

The only thing a little out of the ordinary with the data returned by the query is that one of the columns has a high degree of precision. It is of data type DECIMAL(30,20).


I tried to resolve the issue by changing the precision on the input step to 20, however this did not help.

I couldn't find anything useful in the data source or jdbc logs, however the following error was present in yellowfin.log file:


YF:2019-08-02 10:36:29:ERROR (ETLOutputToDatabaseStep:endRows) - Error: java.sql.BatchUpdateException: Data truncation

java.sql.BatchUpdateException: Data truncation

at net.sourceforge.jtds.jdbc.JtdsStatement.executeBatch(JtdsStatement.java:1077)

at com.hof.mi.process.CSVViewProcess.insertRows(CSVViewProcess.java:1273)

at com.hof.mi.etl.step.definition.ETLOutputToDatabaseStep.bulkRowInsert(ETLOutputToDatabaseStep.java:1309)

at com.hof.mi.etl.step.definition.ETLOutputToDatabaseStep.preEndRows(ETLOutputToDatabaseStep.java:1379)

at com.hof.mi.etl.step.AbstractETLRowStep.endRows(AbstractETLRowStep.java:233)

at com.hof.mi.etl.step.AbstractETLRowStep.endRows(AbstractETLRowStep.java:245)

at com.hof.mi.etl.step.AbstractETLCachedStep.D(AbstractETLCachedStep.java:189)

at com.hof.mi.etl.step.AbstractETLCachedStep.endRows(AbstractETLCachedStep.java:125)

at com.hof.mi.etl.runner.ExecutionHead.call(ExecutionHead.java:29)

at com.hof.mi.etl.runner.ETLRunnerImpl.run(ETLRunnerImpl.java:170)

at com.hof.mi.util.background.ETLQueueItem.process(ETLQueueItem.java:154)

at com.hof.mi.util.background.ETLQueueItem.process(ETLQueueItem.java:42)

at com.hof.mi.util.background.ExecutionQueue$TaskRunner.call(ExecutionQueue.java:473)

at java.util.concurrent.FutureTask.run(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

at java.lang.Thread.run(Unknown Source)

YF:2019-08-02 10:36:29:ERROR (AbstractETLRowStep:D) - Error

com.hof.mi.etl.ETLException: java.sql.BatchUpdateException: Data truncation

at com.hof.mi.etl.ETLException.getETLException(ETLException.java:45)

at com.hof.mi.etl.ETLException.throwETLException(ETLException.java:38)

at com.hof.mi.etl.step.AbstractETLStep.throwUnhandledETLException(AbstractETLStep.java:668)

at com.hof.mi.etl.step.definition.ETLOutputToDatabaseStep.preEndRows(ETLOutputToDatabaseStep.java:1390)

at com.hof.mi.etl.step.AbstractETLRowStep.endRows(AbstractETLRowStep.java:233)

at com.hof.mi.etl.step.AbstractETLRowStep.endRows(AbstractETLRowStep.java:245)

at com.hof.mi.etl.step.AbstractETLCachedStep.D(AbstractETLCachedStep.java:189)

at com.hof.mi.etl.step.AbstractETLCachedStep.endRows(AbstractETLCachedStep.java:125)

at com.hof.mi.etl.runner.ExecutionHead.call(ExecutionHead.java:29)

at com.hof.mi.etl.runner.ETLRunnerImpl.run(ETLRunnerImpl.java:170)

at com.hof.mi.util.background.ETLQueueItem.process(ETLQueueItem.java:154)

at com.hof.mi.util.background.ETLQueueItem.process(ETLQueueItem.java:42)

at com.hof.mi.util.background.ExecutionQueue$TaskRunner.call(ExecutionQueue.java:473)

at java.util.concurrent.FutureTask.run(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

at java.lang.Thread.run(Unknown Source)


Cheers,

Edgar

Comments (6)

photo
1

Hi Edgar,

Thanks for reaching out. To fully test this, I'll need to know what RDBMS you're testing this with. Unfortunately, the stack trace doesn't provide indication of this.

I went ahead and did this with MySQL anyways to see what happened. If I understand correctly, you're pulling data from four columns, I'm assuming in one of these columns you have a numeric value with either more than 30 places in the whole number and/or 20 decimal places? And you're seeing this error in attempting to then output these columns into a database where one of the columns is set to DECIMAL(30,20). Is this correct?

If so, I think this is to be expected, at least with MySQL. If I simply try to draft an INSERT statement (which is of course ultimately what's happening in the transformation when outputting to a table) in MySQL that contains a number with more than 20 decimal places, the value is automatically truncated.

Here's my example:

INSERT statement featuring less than 20 decimals:

/AQrjNVQXeJ2sAAAAAElFTkSuQmCC

Works without error.

INSERT statement with a value featuring more than 20 decimals:

/Hk4Mnvu6CIIgCIIgCIJwposQRRRRRBFFFFFEEUUUUUQRRRRRRBFFFFFEEUUUUUQRRRRRznn5P3RmG1l6+ewSAAAAAElFTkSuQmCC

I get a warning and if I select this column from my table, you can see all the values I've inserted have 20 decimal places:

/UYaaaSRRv5ENBYFRrwVGB4eRqXqQaHoMtJII4008ieisSgw4q2AsSgw0kgjjfzp+X8A5UrPK+Yk8osAAAAASUVORK5CYII=

I guess ultimately though, the values should still be inserted even if there's some sort of warning being thrown by the database, so please let me know if my understanding above is correct so I can create a column with a number with more than 20 decimal points then try writing it to a table featuring a column set to DECIMAL(30,20) via a Transformation Flow.


Thanks,

Mike

photo
1

Hi Edgar,

I just wanted to check in and see how things are going with this.

Regards,

Mike

photo
1

Hi Mike,

Sorry for the delay in responding. Firstly, the system is using MS SQL Server.

One piece of information I didn't add here was that the truncation is dropping the table at each refresh.

I did some more digging here and noticed that the output of the stored procedure has a data type of DECIMAL(30,20), however the field was being created in Yellowfin as Decimal(38,30). This was causing an overflow on the left hand side of the number as it only allowed 8 numbers, not 10.

I've worked around this by changing the Data Transformation to be a truncate (not drop) and manually adjusting the precision of the target table.

Thanks for your help.

Cheers,

Edgar

photo
1

Hi Edgar,

Thanks for confirming and explaining your solution here. Please let us know if you'd like us to continue troubleshooting the cause of this and if so, can you please provide some screenshots of your setup so we can attempt replication and log this? Basically, I'd just require some screenshots of your Transformation Flow setup as well as receive some information on what the Stored Procedure itself is doing.


Thanks,

Mike

photo
1

Hi Mike,

I believe there is a bug here so it would probably be a good idea to look into it because others are likely to run into it. Personally I'm going to settle for the workaround to avoid spending more time on it.

The DT setup is very simple and contains two steps.

/HzboU6o1ho+JAAAAAElFTkSuQmCC

The first step just executes the stored procedure (EXECUTE dbo.xxxx). This returns a 4 columns of data, one with the high-precision metric.

The second DT just outputs to a DB table. It was originally set to a "drop existing", so I would expect that Yellowfin would detect the precision (from step 1 which ran successfully) and create a table with appropriate precision to load the data into.

Cheers,

Edgar

photo
1

Hi Edgar,

Thanks for reaching out. I've been able to replicate this. Not with the same exact changes, likely due to differences in setup, but the end result is the same.

After executing a Stored Procedure and calling fields with the following values:

/1iY4TRhwoQJ+P+fIK+yal6OuAAAAABJRU5ErkJggg==

And selecting Drop Existing:

/d7U6RJGqR2kAAAAASUVORK5CYII=

They were transformed to:

/H+nJskwF9bGCAAAAABJRU5ErkJggg==


I've gone ahead and logged a defect for this. Any potential updates regarding this will be posted here.

Regards,

Mike