Quick Start Demo Issue with source hudi and target iceberg #256

Closed
rahil-c opened this issue Nov 22, 2023 · 8 comments

@rahil-c

rahil-c commented Nov 22, 2023

Hi all, I'm trying to run the quick start demo but am running into the following issue. I created the Hudi table:

 rahilchertara@rahils-mbp  /tmp/hudi-dataset/people  ls -a
.          .hoodie    city=DFW   city=ORD   city=SFO
..         _delta_log city=NYC   city=SEA
 rahilchertara@rahils-mbp  /tmp/hudi-dataset/people 

I am also using this YAML file:

sourceFormat: HUDI
targetFormats:
  - DELTA
  - ICEBERG
datasets:
  -
    tableBasePath: file:///tmp/hudi-dataset/people
    tableName: people
    partitionSpec: city:VALUE

I see the Delta log created, but I do not see an Iceberg metadata folder created. The OneTable job also seems to hang for me.

 java -jar utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
SLF4J: Class path contains SLF4J bindings targeting slf4j-api versions 1.7.x or earlier.
SLF4J: Ignoring binding found at [jar:file:/Users/rahilchertara/workplace/onetable/utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See https://www.slf4j.org/codes.html#ignoredBindings for an explanation.
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2023-11-22 09:13:48 INFO  io.onetable.utilities.RunSync:141 - Running sync for basePath file:///tmp/hudi-dataset/people for following table formats [DELTA, ICEBERG]
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/Users/rahilchertara/workplace/onetable/utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2023-11-22 09:13:49 INFO  io.onetable.client.OneTableClient:220 - No previous OneTable sync for target. Falling back to snapshot sync.
2023-11-22 09:13:49 INFO  io.onetable.client.OneTableClient:220 - No previous OneTable sync for target. Falling back to snapshot sync.
# WARNING: Unable to get Instrumentation. Dynamic Attach failed. You may add this JAR as -javaagent manually, or supply -Djdk.attach.allowAttachSelf

@sagarlakshmipathy
Contributor

Hi @rahil-c - thanks for opening the issue.

I ran the same example you're referring to, but was unable to replicate the issue you're facing. So I have a few follow-up questions.

  1. Are you using JDK11?
  2. While building the jar from source (assuming you built it from main branch), did you see any errors?
  3. Just for good measure, can you pull the latest changes and build the jar once again?

From my tests:

sagarl@dev onetable_256 % cat my_config.yaml 
sourceFormat: HUDI
targetFormats:
  - DELTA
  - ICEBERG
datasets:
  -
    tableBasePath: file:///tmp/hudi-dataset/people
    tableName: people
    partitionSpec: city:VALUE
sagarl@dev onetable_256 % java -jar utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
SLF4J: Class path contains SLF4J bindings targeting slf4j-api versions 1.7.x or earlier.
SLF4J: Ignoring binding found at [jar:file:/Users/sagarl/oss/onetable_256/utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See https://www.slf4j.org/codes.html#ignoredBindings for an explanation.
2023-11-22 10:28:14 INFO  io.onetable.utilities.RunSync:150 - Running sync for basePath file:///tmp/hudi-dataset/people for following table formats [DELTA, ICEBERG]
2023-11-22 10:28:17 INFO  io.onetable.client.OneTableClient:220 - No previous OneTable sync for target. Falling back to snapshot sync.
2023-11-22 10:28:17 INFO  io.onetable.client.OneTableClient:220 - No previous OneTable sync for target. Falling back to snapshot sync.
# WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
2023-11-22 10:28:35 INFO  io.onetable.client.OneTableClient:123 - OneTable Sync is successful for the following formats [DELTA, ICEBERG]
sagarl@dev onetable_256 % ls -la /tmp/hudi-dataset/people                                                               
total 16
drwxr-xr-x@ 11 sagarl  wheel   352 Nov 22 10:28 .
drwxr-xr-x@  4 sagarl  wheel   128 Nov 22 10:23 ..
-rw-r--r--@  1 sagarl  wheel  6148 Nov 22 10:23 .DS_Store
drwxr-xr-x@ 15 sagarl  wheel   480 Nov 22 10:21 .hoodie
drwxr-xr-x   4 sagarl  wheel   128 Nov 22 10:28 _delta_log
drwxr-xr-x@  6 sagarl  wheel   192 Nov 22 10:21 city=DFW
drwxr-xr-x@  8 sagarl  wheel   256 Nov 22 10:21 city=NYC
drwxr-xr-x@  6 sagarl  wheel   192 Nov 22 10:21 city=ORD
drwxr-xr-x@  6 sagarl  wheel   192 Nov 22 10:21 city=SEA
drwxr-xr-x@  6 sagarl  wheel   192 Nov 22 10:21 city=SFO
drwxr-xr-x  12 sagarl  wheel   384 Nov 22 10:28 metadata
sagarl@dev onetable_256 % 
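
The listing above can be turned into a quick check of which format metadata directories exist under a table base path. This is only a minimal sketch: the function name check_table_formats is hypothetical, and the directory names (.hoodie, _delta_log, metadata) are assumed from the listings in this thread rather than from any OneTable documentation.

```shell
# Report which table-format metadata directories exist under a table base path.
# Directory names are assumptions taken from the listings in this thread:
#   .hoodie (Hudi), _delta_log (Delta), metadata (Iceberg).
check_table_formats() {
  base_path="$1"
  for entry in "HUDI:.hoodie" "DELTA:_delta_log" "ICEBERG:metadata"; do
    format="${entry%%:*}"   # text before the first colon
    dir="${entry#*:}"       # text after the first colon
    if [ -d "$base_path/$dir" ]; then
      echo "$format: present"
    else
      echo "$format: missing"
    fi
  done
}

# Example usage with the path from this thread:
# check_table_formats /tmp/hudi-dataset/people
```

On the original reporter's directory this would be expected to show ICEBERG as missing, since only .hoodie and _delta_log appear in that listing.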

@sagarlakshmipathy
Contributor

Alternatively, if you're just running the quickstart example, you can try running the sync with the utilities-0.1.0-beta1-bundled.jar at https://github.com/onetable-io/onetable/packages/1986830

i.e. java -jar ~/Downloads/utilities-0.1.0-beta1-bundled.jar --datasetConfig my_config.yaml

@rahil-c
Author

rahil-c commented Nov 23, 2023

Thanks @sagarlakshmipathy for getting back to me.

  1. I am using JDK 11.
  2. I built the jar without issues.
  3. I pulled the latest changes on main and tried again, with the same issue.

I will try using the beta jar. One difference I noticed between your log output and mine is that mine contains this line:

# WARNING: Unable to get Instrumentation. Dynamic Attach failed. You may add this JAR as -javaagent manually, or supply -Djdk.attach.allowAttachSelf

The job hangs on it indefinitely, so I'm not sure whether this is the issue.

Update: I also tried with the beta jar and still hit the same issue.
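
One way to see where a hanging run is stuck is to take a thread dump of the JVM while it hangs. The sketch below uses jps and jstack, which ship with the JDK; the function names find_jvm_pid and dump_threads and the output path are hypothetical, and nothing here is specific to OneTable.

```shell
# Find the PID of a running JVM whose jar path or main class contains a pattern.
# `jps -l` prints one "<pid> <jar path or main class>" line per JVM.
find_jvm_pid() {
  jps -l | awk -v pat="$1" 'index($2, pat) { print $1; exit }'
}

# Write a thread dump of the matching JVM to a file for inspection.
dump_threads() {
  pid="$(find_jvm_pid "$1")"
  if [ -z "$pid" ]; then
    echo "no running JVM matches: $1" >&2
    return 1
  fi
  jstack "$pid" > "$2"
}

# Example usage from a second terminal while the sync appears to hang:
# dump_threads utilities-0.1.0-beta1-bundled.jar /tmp/onetable-threads.txt
```

The warning text itself names -Djdk.attach.allowAttachSelf as an option; this thread does not confirm whether supplying it avoids the hang.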

@sagarlakshmipathy
Contributor

Hi @rahil-c,

I faced this error a while back, and the cause was that I was in an old terminal session that was still using JDK 8. In your case you have confirmed that you're using JDK 11, so I'm not sure how I can replicate this issue.
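
To rule out a stale terminal session like the one described above, the Java version that the current shell actually resolves can be checked before running the jar. This is a minimal sketch; the function names java_major_version and require_java_11 are hypothetical, and it relies only on the standard `java -version` output.

```shell
# Extract the major Java version from `java -version` (which prints to stderr).
# Handles both legacy "1.8.0_292" and modern "11.0.21" version strings.
java_major_version() {
  version="$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')"
  case "$version" in
    1.*) echo "$version" | cut -d. -f2 ;;   # 1.8.0_292 -> 8
    *)   echo "$version" | cut -d. -f1 ;;   # 11.0.21   -> 11
  esac
}

# Fail with a warning unless the java on PATH in this session is JDK 11.
require_java_11() {
  major="$(java_major_version)"
  if [ "$major" = "11" ]; then
    echo "OK: java on PATH is JDK 11"
  else
    echo "WARNING: java on PATH is JDK ${major:-unknown}, expected 11" >&2
    return 1
  fi
}

# Example usage:
# require_java_11 && java -jar utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```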

However, I've raised this PR to add a docker playground where folks can try OneTable.

This will at least ensure there are no environment-related issues. I've tested all sync combinations and they seem to work well. You can wait until it gets merged to main, or you can give it a try now. The doc for using the docker playground is here: https://github.com/onetable-io/onetable/pull/258/files#diff-d713b45afd9e8b5ffc2728e6ba8882871143038694ff6e0141a56606c0738f5e

Let me know how it goes.

@rahil-c
Author

rahil-c commented Nov 26, 2023

Very cool, thank you @sagarlakshmipathy for the constant effort on this issue. I will give it a try!

@rahil-c
Author

rahil-c commented Nov 27, 2023

I do want to add one thing I noticed to this thread. On my Intel-based Mac I had no issues running this quick start demo, but the issue above still happens on my M2 Pro Mac, so I'm not sure whether it's related to that.

I'm using Java 11 on both, but I think the Intel Mac has mvn 3.6.3 and the M2 is running 3.9.

@sagarlakshmipathy
Contributor

Thanks for reporting your findings.

I have an M2 Mac as well, and I don't see any issues while running the quickstart example on my machine (mvn 3.9.4).

sagarl@dev ~ % uname -m     
arm64

sagarl@dev ~ % mvn --version
Apache Maven 3.9.4 (dfbb324ad4a7c8fb0bf182e6d91b0ae20e3d2dd9)
Maven home: /Users/sagarl/Downloads/apache-maven-3.9.4
Java version: 11.0.11, vendor: AdoptOpenJDK, runtime: /Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "11.0", arch: "x86_64", family: "mac"

A quick side note: can you double check if you're seeing something similar to the issue mentioned here
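
In the output above, uname -m reports arm64 while Maven reports arch x86_64. One possible reading (an assumption, not confirmed in this thread) is that the JDK in use is an x86_64 build. The sketch below compares the machine architecture with the JVM's os.arch property; the function names jvm_arch and compare_arch are hypothetical.

```shell
# Print the os.arch property of the JVM that `java` resolves to.
# -XshowSettings:properties writes the JVM system properties to stderr.
jvm_arch() {
  java -XshowSettings:properties -version 2>&1 |
    awk -F' = ' '/os\.arch/ { print $2; exit }'
}

# Compare the machine architecture with the JVM architecture.
# Naming differs between tools, so normalize arm64 -> aarch64 and amd64 -> x86_64.
compare_arch() {
  machine="$(uname -m)"
  jvm="$(jvm_arch)"
  case "$machine" in arm64) machine="aarch64" ;; esac
  case "$jvm" in amd64) jvm="x86_64" ;; esac
  if [ "$machine" = "$jvm" ]; then
    echo "match: machine and JVM are both $jvm"
  else
    echo "mismatch: machine is $machine but JVM reports $jvm"
  fi
}

# Example usage:
# compare_arch
```

A mismatch is not necessarily a problem, but it is one more environment difference worth recording when a run behaves differently across machines.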

@rahil-c
Author

rahil-c commented Nov 27, 2023

OK, I think I finally fixed the issue on my M2 Mac. What worked for me, for some reason, was installing Amazon Corretto JDK 11.

ilchertara@rahils-mbp  ~/workplace/onetable   main ±  java -version
openjdk version "11.0.21" 2023-10-17 LTS
OpenJDK Runtime Environment Corretto-11.0.21.9.1 (build 11.0.21+9-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.21.9.1 (build 11.0.21+9-LTS, mixed mode)
 rahilchertara@rahils-mbp  ~/workplace/onetable   main ± 

Apache Maven 3.9.5 (57804ffe001d7215b5e7bcb531cf83df38f93546)
Maven home: /opt/homebrew/Cellar/maven/3.9.5/libexec
Java version: 11.0.21, vendor: Amazon.com Inc., runtime: /Library/Java/JavaVirtualMachines/amazon-corretto-11.jdk/Contents/Home
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "14.1.1", arch: "aarch64", family: "mac"

Now the quick start no longer hangs and completes successfully. Thanks @sagarlakshmipathy for the help!

@rahil-c rahil-c closed this as completed Nov 27, 2023