
Unable to override endpoint for pyspark [BUG] #709

Open
konradwudkowski opened this issue Apr 29, 2021 · 4 comments
@konradwudkowski

Describe the bug
I'm trying to override the config for a Livy endpoint in a PySpark notebook, but it doesn't work.

To Reproduce

  1. Create a pyspark notebook
  2. Add the following to a cell at the beginning:
%%local
import sparkmagic.utils.configuration as conf

modified_python_creds = {**conf.kernel_python_credentials(), 'url':'http://my-new-endpoint:8998'}
conf.override(conf.kernel_python_credentials.__name__, modified_python_creds)

  3. Confirm the creds are updated:
%%local
print(conf.kernel_python_credentials())
{'username': '', 'password': '', 'url': 'http://my-new-endpoint:8998', 'auth': 'None'}
  4. Connect to Spark; it will use the previous kernel_python_credentials instead of the new ones

Expected behavior
I'd expect to be able to override credentials as above, since the technique works for (at least some) other settings; e.g. the following works as expected:

conf.override('shutdown_session_on_spark_statement_errors', True)

If there is a reason not to allow such overriding, there should ideally be an error or warning explaining that.

Versions:

  • SparkMagic 0.15.0
  • Livy (if you know it) 0.7.0
  • Spark 3.0.1 (EMR 6.2)

Additional context
I'm now aware that, as a workaround, I can start an IPython notebook, run %load_ext sparkmagic.magics and %spark add -l python -u http://livy.example.com, and then add %%spark to every cell.
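
For reference, the workaround looks roughly like this as notebook cells (the endpoint URL is a placeholder, and depending on the setup %spark add may also need a session name and auth flags):

%load_ext sparkmagic.magics
%spark add -l python -u http://livy.example.com

%%spark
# runs remotely on the Livy-backed Spark session; assumes the usual spark object is available there
print(spark.version)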

@edwardps
Contributor

edwardps commented May 9, 2021

+1
This feature enhancement will be very helpful when the notebook runs in a cloud environment where the sparkmagic config file cannot be generated in advance. Hopefully the team can consider the configuration overriding described above, or provide endpoint configuration through a notebook cell for the pyspark or spark kernel. Thanks a lot.

@ellisonbg
Contributor

We talked with @aggFTW today and it looks like a minor change may fix this. In particular, it looks like we will need to call the refresh_configuration function:

https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/kernels/kernelmagics.py#L429

On line 242:

https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/kernels/kernelmagics.py#L242

(before self._do_not_call_start_session(u"")).
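
A rough sketch of where that call could go (the surrounding lines are paraphrased from kernelmagics.py and may not match the file exactly):

    # inside KernelMagics.configure, in the branch that restarts an existing session with -f
    self._do_not_call_delete_session(u"")
    self._override_session_settings(dictionary)
    self.refresh_configuration()  # proposed addition: re-read endpoint/credentials from conf
    self._do_not_call_start_session(u"")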

@edwardps do you want to test that fix and submit a PR?

@edwardps
Contributor

Sure thing. I will do a quick test of the suggested fix on my end.

@edwardps
Contributor

Hi Brian,

Some updates:

Lines 241 and 244 call _override_session_settings() to override session-related settings (please see the method definition below). The endpoint info is saved in a credentials element such as kernel_python_credentials [1].

    @staticmethod
    def _override_session_settings(settings):
        conf.override(conf.session_configs.__name__, settings)

The configure magic seems to be designed to override only the Spark session parameters. But combined with the overriding code from Konrad (see below), the session can be created against the overridden endpoint. The refresh_configuration method should be called before lines 242 and 244 to handle the two cases, depending on whether an existing session has already been created or not.

%%local
import sparkmagic.utils.configuration as conf
modified_python_creds = {**conf.kernel_python_credentials(), 'url':'http://my-new-endpoint:8998', 'password':...}

So we can make this quick fix (simply add refresh_configuration calls) to enable endpoint overriding. Any thoughts?
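
For illustration, if refresh_configuration is added there, the override flow from a notebook could look roughly like this (hypothetical endpoint; the %%configure -f cell forces the session to be recreated, at which point the refreshed credentials would be picked up):

%%local
import sparkmagic.utils.configuration as conf
modified_python_creds = {**conf.kernel_python_credentials(), 'url': 'http://my-new-endpoint:8998'}
conf.override(conf.kernel_python_credentials.__name__, modified_python_creds)

%%configure -f
{"driverMemory": "1000M"}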

[1] https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/example_config.json#L2

edwardps added a commit to edwardps/sparkmagic that referenced this issue May 27, 2021
**Description**
Introduce an override type parameter for the configure magic to
support changing the configuration at runtime. To avoid breaking
existing usage, the default type is session_configs, so there is
no behavior change. The configure type can be kernel_python_credentials,
kernel_scala_credentials, authenticators, etc. Also, after overriding
the configuration, refresh_configuration is called so the configuration
changes take effect.

**Motivation**

This change is to address the request:
jupyter-incubator#709

**Testing Done**

Added unit test cases and also manually tested the endpoint
overriding in a notebook using the following magics.

%%configure -f -t kernel_python_credentials
{"username": "billy", "base64_password": "d2VsY29tZTEyMw==", "url": "http://abc:8998", "auth": "Basic_Access"}

%%configure -f -t kernel_python_credentials
{"username": "", "password": "", "url": "http://abc:8998", "auth": "None"}