Large CSV file hangs on "Processing File" indefinitely

Sam shared this problem 4 months ago
Defect Logged

I'm trying to import a .csv file that is 640,000 rows x 8 columns. Yellowfin shows a progress bar that resets a few times, then hangs on "Processing File" indefinitely. When I sample the .csv down to 1,000 rows, it imports almost instantly.

Any advice for importing the larger file (65MB)?

Comments (9)

Hi Sam,

Thanks for reaching out. As it happens, we replicated this exact behavior and raised a defect for it just yesterday. That case involved an 800k-row CSV, but the behavior was the same, and everything works as expected with smaller files.

Given this, there are two options for now:

1. Break the CSV into smaller chunks and import them separately (see the sketch below)

2. Wait for updates on the logged defect
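
For what it's worth, a minimal sketch of option 1 might look like the Java below: it splits a large CSV into smaller files, repeating the header row in each chunk. The file names and the 100,000-row chunk size are placeholders to adjust for your data, and it assumes no fields contain embedded newlines.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CsvSplitter {
    public static void main(String[] args) throws IOException {
        int rowsPerChunk = 100_000; // pick a size that imports cleanly

        try (BufferedReader reader = Files.newBufferedReader(Paths.get("large_import.csv"))) {
            String header = reader.readLine(); // repeat the header in every chunk
            String line;
            int rowCount = 0;
            int chunkIndex = 0;
            BufferedWriter writer = null;

            while ((line = reader.readLine()) != null) {
                if (rowCount % rowsPerChunk == 0) {
                    if (writer != null) {
                        writer.close();
                    }
                    chunkIndex++;
                    writer = Files.newBufferedWriter(
                            Paths.get("large_import_part" + chunkIndex + ".csv"));
                    writer.write(header);
                    writer.newLine();
                }
                writer.write(line);
                writer.newLine();
                rowCount++;
            }
            if (writer != null) {
                writer.close();
            }
        }
    }
}

Each part file can then be imported on its own until the fix lands.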

I've gone ahead and set this ticket to Defect Logged and I'll keep you posted on any updates from the dev team as they come along.

Regards,

Mike

Good to know, thanks Mike! I'll split up the CSV, which will work for now; once this is addressed I'll go back and re-import the full data set.

Hi Sam,

You're welcome! I'll keep you posted.

Regards,

Mike

Follow-up info: I cut it down to 375k records. At that size, I can get to the stage where I retitle all the columns and specify data types, but from there I got this error message:

[screenshot of the error message]

I tried re-importing that CSV with High Precision selected, which caused it to error at the start of the process:

Error: java.lang.OutOfMemoryError: Java heap space
At:
java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149)
java.lang.StringCoding.decode(StringCoding.java:193)
java.lang.StringCoding.decode(StringCoding.java:254)
java.lang.String.<init>(String.java:546)
java.lang.String.<init>(String.java:566)
com.hof.util.Base64.encodeFromStream(Base64.java:812)
...

I'll keep trimming it smaller, but wanted to share these additional error behaviors.

Hi Sam,

Thanks for keeping us posted. Large ETL transformations can use a lot of memory, and what you ran into here is an out-of-memory error. That suggests that even if the larger file were loading as expected, you wouldn't be able to complete the transformation anyway. How much memory is currently assigned to your Yellowfin instance? You can find this next to JVM Max Memory under System Information on the info.jsp page, which you can reach by appending '/info.jsp' to your login URL, for example localhost:8080/info.jsp. The same information is also available under Administration > System Information.
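
For reference, the JVM Max Memory figure on that page should correspond to the maximum heap the Java runtime itself reports, which is controlled by the JVM's -Xmx startup option. A tiny sketch like the one below is purely illustrative; run under the same JVM settings as your Yellowfin instance, it would print that value in MB:

public class MaxHeapCheck {
    public static void main(String[] args) {
        // Runtime.maxMemory() is the largest heap this JVM will attempt to use,
        // i.e. whatever was set via -Xmx (or the JVM default if it was not set).
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("JVM Max Memory: %.1f MB%n", maxBytes / (1024.0 * 1024.0));
    }
}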

Here is our minimum server requirements documentation, and here is how to increase JVM Max Memory, if that winds up being needed.

Please let me know.

Thanks,

Mike

JVM Max Memory: 910.5 MB

Hi Julius,

That is far below the recommended minimum of 15GB dedicated to Yellowfin on the application server. Please increase your JVM Max Memory to as close to that value as you can by following the steps in the article I linked previously, and let me know if you still run into this issue when transforming the 375k records.

Regards,

Mike

Hi Mike - just closing the loop: once we updated the JVM Max Memory, the 375k-record subset imported successfully. So this is now back to just the original defect identified above for the larger number of records. Thanks for the help with the memory problem!

Thanks for the update Sam!

Hi Sam,

Just writing to let you know this defect has been fixed and will be published in the next build, 9.6, with an ETA of some time next month. I will let you know once it is published.

Regards,

Mike