- Create a virtual environment with the same Python version as the Databricks runtime:
  `virtualenv .venv`
- In `setup.py`, change the Databricks runtime version in `install_requires` from `databricks-connect==6.2.*` to the one you are using. For example, if you are using 5.5, change it to `==5.5.*`.
- Install the current directory with `setup.py` into the virtual environment (see the sketch after this list):
  `pip install -e .`
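Putting the three steps together, a minimal sketch might look like this (the Python 3.7 / runtime 5.5 pairing is an assumption; substitute the versions matching your cluster):

```sh
# Create a virtualenv on the same Python version as the cluster runtime
virtualenv -p python3.7 .venv
source .venv/bin/activate

# In setup.py, pin the client to your runtime, e.g.:
#   install_requires=["databricks-connect==5.5.*"]

# Install this project (pulling in databricks-connect) into the virtualenv
pip install -e .
```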
Follow the steps in the official guide to finish configuring the client.
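In short, the client setup comes down to two `databricks-connect` commands; `configure` prompts for the workspace host, token, cluster ID, org ID, and port:

```sh
# Enter connection details interactively
databricks-connect configure

# Verify the client can reach the cluster and start a Spark session
databricks-connect test
```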
Follow the steps in the guide for VS Code or Jupyter to configure the IDE.
Note: Check that you don't have SPARK_HOME set to your local Spark installation. If it is set, either unset it or use `python.envFile` to set SPARK_HOME to the path returned by `databricks-connect get-spark-home`.
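For example (assuming your `python.envFile` setting points at a file named `.env` in the project root):

```sh
# Make sure a local Spark installation doesn't shadow databricks-connect
unset SPARK_HOME

# Or write the databricks-connect Spark home into the env file the IDE reads
echo "SPARK_HOME=$(databricks-connect get-spark-home)" > .env
```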
Once you have code you want to deploy to Databricks:
- Import notebooks directly to Databricks from the `.ipynb` or `.py` files.
- For the libraries in src/ you will need to build a library and upload it. Use `python setup.py bdist_spark` or `python setup.py bdist_egg` to build a library in dist/. Import this library into Databricks and install it on the cluster (see the sketch below).
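Both deployment paths can be scripted with the databricks CLI; the sketch below assumes the CLI is installed and configured, and every path, filename, and cluster ID is a placeholder:

```sh
# Import a notebook into the workspace from a local .py file
databricks workspace import --language PYTHON notebooks/analysis.py /Users/me@example.com/analysis

# Build the library from src/ into dist/
python setup.py bdist_egg

# Upload the egg to DBFS and install it on the cluster
databricks fs cp dist/my_project-0.1-py3.7.egg dbfs:/libraries/my_project-0.1-py3.7.egg
databricks libraries install --cluster-id 0123-456789-abcdef --egg dbfs:/libraries/my_project-0.1-py3.7.egg
```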