Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-48385][SQL] Migrate the jdbc driver of mariadb from 2.x to 3.x #46655

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

panbingkun
Copy link
Contributor

@panbingkun panbingkun commented May 20, 2024

What changes were proposed in this pull request?

The pr aims to

  • migrate the jdbc driver of mariadb from 2.x to 3.x.
  • upgrade the image version of mariadb from 10.5.12 to 10.5.24.

Why are the changes needed?

1.First PR of Series 3.x:
https://github.com/mariadb-corporation/mariadb-connector-j/releases/tag/3.0.0-alpha
image

2.Seen from the Maven central repo, series 3. x should have become the mainstream, with faster release frequency
https://mvnrepository.com/artifact/org.mariadb.jdbc/mariadb-java-client

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Update existed UT.
  • Pass GA

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot removed the SQL label May 20, 2024
@github-actions github-actions bot added the SQL label May 21, 2024
@@ -46,7 +46,7 @@ class MariaDBKrbIntegrationSuite extends DockerKrbJDBCIntegrationSuite {
override val jdbcPort = 3306

override def getJdbcUrl(ip: String, port: Int): String =
s"jdbc:mysql://$ip:$port/mysql?user=$principal"
s"jdbc:mysql://$ip:$port/mysql?permitMysqlScheme&user=$principal"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@panbingkun panbingkun marked this pull request as ready for review May 22, 2024 07:26
@panbingkun
Copy link
Contributor Author

cc @yaooqinn

@panbingkun panbingkun changed the title [ONLY TEST] Test mariadb 3.x [SPARK-48385][SQL] Migrate the driver of mariadb from 2.x to 3.x May 22, 2024
@panbingkun panbingkun changed the title [SPARK-48385][SQL] Migrate the driver of mariadb from 2.x to 3.x [SPARK-48385][SQL] Migrate the jdbc driver of mariadb from 2.x to 3.x May 22, 2024
@yaooqinn
Copy link
Member

shall we also update the server? https://mariadb.com/kb/en/mariadb-10-5-25-release-notes/

@panbingkun
Copy link
Contributor Author

shall we also update the server? https://mariadb.com/kb/en/mariadb-10-5-25-release-notes/

Sure, Let me verify locally first.

@panbingkun
Copy link
Contributor Author

shall we also update the server? https://mariadb.com/kb/en/mariadb-10-5-25-release-notes/

https://hub.docker.com/_/mariadb/tags?page=&page_size=&ordering=&name=10.5.25
It seems that the official image 10.5.25 is not ready yet, but 10.5.24 is ok.

yaooqinn
yaooqinn previously approved these changes May 22, 2024
assert(rows.get(0).isInstanceOf[Integer])
assert(rows.get(1).isInstanceOf[Long])
assert(rows.get(2).isInstanceOf[Integer])
assert(rows.get(3).isInstanceOf[BigDecimal])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you describe the root cause of this change, @panbingkun ? For me, this looks suspicious like a breaking change.

- assert(rows.get(3).isInstanceOf[Long])
+ assert(rows.get(3).isInstanceOf[BigDecimal])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the root cause of this change

  • change:
    index[0], Short => Integer
    index[1], Integer => Long
    index[3], Long => Decimal

  • Before - (mariadb jdbc driver: 2.7.12):
    image

  • After - (mariadb jdbc driver: 3.4.0):
    image

  • Summary
    the data types obtained from the underlying meta are different

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also add compatible logic in MySQLDialect##getCatalystType. Do we need to do this in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make changes as follows to keep this compatible:
image

If we need to do this, how about making related changes directly in this PR? Or a new seperated PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After - (mariadb jdbc driver: 3.4.0):

These mariadb SQL Data Type <-> JDBC Data Type mapping rules seem incorrect, do they have a statement why they made such change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try to find it.

@panbingkun
Copy link
Contributor Author

panbingkun commented May 23, 2024

Through comparative analysis, it is found that

@panbingkun
Copy link
Contributor Author

IMO, it seems that the 3.x version is more reasonable, but we are not compatible with this case in Spark, so I propose that we can make related changes on the Spark side

@yaooqinn @dongjoon-hyun

@yaooqinn
Copy link
Member

IMO, it seems that the 3.x version is more reasonable,

It's unreasonable to me. The 2.x and MySQL official both use java.sql.Types.SMALLINT + Signed signal = [UN]SIGNED SMALLINT

@panbingkun
Copy link
Contributor Author

IMO, it seems that the 3.x version is more reasonable,

It's unreasonable to me. The 2.x and MySQL official both use java.sql.Types.SMALLINT + Signed signal = [UN]SIGNED SMALLINT

Okay, I see what you mean. What you say is reasonable is that you focus on the value returned in 3.x and do not make full use of isSigned. What I say is reasonable is that you focus on its own division of UnsignedSmallIntColumn and SignedSmallIntColumn without paying attention to its isSigned. Let me investigate again.
it should have been like this in the original version of 3.x (from 3.0.3)
mariadb-corporation/mariadb-connector-j@22cff70

@panbingkun
Copy link
Contributor Author

Regarding the above issues, I have proposed a PR to mariadb-connector-j: mariadb-corporation/mariadb-connector-j#194

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants