UTF-8 character problem in the "TEXT" field - Data Transformation(MySQL)

Merve shared this question 42 days ago
Answered

Hello,

I transfer data from one data source to another data source(MySQL databases)

Data transfer appears to be successful. However, the characters in the "TEXT" field appear incorrectly in the target table in database. There are symbols like question mark(?) instead of Turkish characters. So I am having UTF-8 character problem during data tranfser.

My connection settings:

/595fd80c9d3e4296c2fa3b854e672226

I tried "databasename?characterEncoding=utf8" in the Database Name box. But it didn't work.(Yellowfin Version: 9.7.1 and Java Version:11.0.8)

Is there any way to solve this problem?

Regards,

Merve

Comments (1)

photo
1

Hi Merve,


Thanks for your question.


Would you be able to clarify where you are seeing this output? Is it in your Data Source, in Yellowfin, or somewhere different? Does the data look accurate in the DBMS?


I suspect that everything is fine from the Yellowfin side - the connection string is correct. One thing that you could try, is to use a third-party tool such as DBeaver or DBViz using the same connection string and observe the output for the Turkish characters. If the third-party tool also displays Turkish characters with '?', then the issue is most likely Database related.


Then, it would be worth checking the encoding is correct in the Database by running the query below. Where the results should be 'utf-8' rather than 'latin1' or another encoding method.


show variables like 'char%';

Kind regards,

Simon


photo
1

Hi Simon,

Thanks for your answer.

I already use dbeaver as database management tool. Data in accurate in the source database (where the I fetch the data).

The problem occurs in the target database (where the I put the data).


From Yellowfin Data Transformation section, I fetch a table from the source database. Then select the target database :

/078b875a11f2452e082e18764e9651bf


If there is no new table, it creates and transfers data. When I check the newly created table in the target database from dbeaver, there is a question mark (?) instead of Turkish characters in the "TEXT" field.


I open dbeaver, check the properties of the table created with Yellowfin and it looks like this:


/a27c8fb9432c2d2c0b28ed9fc44687a4


I think we should be able to give this setting while creating Database Connection Settings in Yellowfin.

I still looking for a solution.


Kind regards,

Merve

photo
1

Hi Merve,

After some investigation, it seems that MySQL uses the ‘latin1’ charset as default, when client programs make a connection. This article explains. The solution it seems is to set the charset and collation after the client makes a connection to the database, however, I’m not 100% sure how this would work in practice using Yellowfin.

I’ll have to raise a task to have this looked into by our Developers, who might be able to come up with a solution.

Let me know if you have any further questions.

Kind regards,

Simon

photo
1

Hi Merve,


To further provide some guidance here, instead of using the MySQL JDBC Connection URL, you can use the Generic JDBC Connection URL and create the connection string manually, where you can define the charset using ?characterEncoding=utf8 as you suggested earlier.


/da913c9daa202c858155e7a6fc3b7dfa


I'm going to go ahead and mark this as answered for the moment.


Please feel to reach out with any further questions and someone will get back to you shortly.


Kind regards,

Simon

photo