Notebook sessions require a conda environment that has the BDS module of ADS installed.
Using the Vault
The preferred method to connect to a BDS cluster is to use the
BDSSecretKeeper class. This allows you to store the BDS credentials in
the vault and not the notebook. It also provides a greater level of access control to the secrets and allows for credential rotation
without breaking connections from various sources.
import ads
from ads.bds.auth import krbcontext
from ads.secrets.big_data_service import BDSSecretKeeper
from pyhive import hive

# Authenticate to OCI with the notebook session's resource principal.
ads.set_auth('resource_principal')

# Load the BDS credentials from the vault, then obtain a Kerberos ticket.
with BDSSecretKeeper.load_secret("<secret_id>") as cred:
    with krbcontext(principal=cred["principal"], keytab_path=cred['keytab_path']):
        # Open a Hive cursor while the ticket is valid.
        cursor = hive.connect(host=cred["hive_host"],
                              port=cred["hive_port"],
                              auth='KERBEROS',
                              kerberos_service_name="hive").cursor()
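Once a cursor is available, queries go through the usual Python DB-API 2.0 calls (execute, description, fetchall), which PyHive cursors implement. The following is a minimal sketch of a helper that returns rows as dictionaries. The name fetch_dicts is hypothetical, and sqlite3 is used only as a stand-in cursor so the example runs without a cluster; with BDS you would pass the Hive cursor instead.

```python
import sqlite3


def fetch_dicts(cursor, sql, params=None):
    """Run a query on a DB-API 2.0 cursor and return rows as dictionaries."""
    if params is None:
        cursor.execute(sql)
    else:
        cursor.execute(sql, params)
    # Each entry of cursor.description describes one column; item 0 is its name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in for the Hive cursor (assumption: any DB-API 2.0 cursor works here).
demo_cursor = sqlite3.connect(":memory:").cursor()
demo_cursor.execute("CREATE TABLE trips (city TEXT, rides INTEGER)")
demo_cursor.executemany(
    "INSERT INTO trips VALUES (?, ?)", [("Austin", 3), ("Boston", 5)]
)

rows = fetch_dicts(demo_cursor, "SELECT city, rides FROM trips ORDER BY city")
print(rows)
```

Passing the cursor in as an argument keeps the helper independent of how the connection was authenticated, so the same function can be reused with or without the vault.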
Without Using the Vault
BDS requires a Kerberos ticket to authenticate to the service. The preferred method is to use the vault because it is more secure and prevents private information from being stored in a notebook. However, if this is not possible, you can use the refresh_ticket() method to manually create the Kerberos ticket. This method requires the following parameters:
kerb5_path: The path to the krb5.conf file. You can copy this file from the master node of the BDS cluster.
keytab_path: The path to the principal's keytab file. You can download this file from the master node of the BDS cluster.
principal: The unique identity to which Kerberos can assign tickets.
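import ads
from ads.bds.auth import refresh_ticket
from pyhive import hive

# Authenticate to OCI with the notebook session's resource principal.
ads.set_auth('resource_principal')

# Create the Kerberos ticket from local copies of the keytab and krb5.conf files.
refresh_ticket(principal="<your_principal>",
               keytab_path="<your_local_keytab_file_path>",
               kerb5_path="<your_local_kerb5_config_file_path>")

# Open a Hive cursor using the ticket created above.
cursor = hive.connect(host="<hive_host>",
                      port="<hive_port>",
                      auth='KERBEROS',
                      kerberos_service_name="hive").cursor()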
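Because refresh_ticket() depends on local files, it can help to check its three inputs before requesting a ticket. The sketch below is not part of ADS: check_kerberos_inputs is a hypothetical helper that uses only the standard library, and the temporary files stand in for a real keytab and krb5.conf.

```python
import os
import tempfile


def check_kerberos_inputs(principal, keytab_path, kerb5_path):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    if not principal or not principal.strip():
        problems.append("principal is empty")
    for label, path in (("keytab_path", keytab_path), ("kerb5_path", kerb5_path)):
        expanded = os.path.expanduser(path)
        if not os.path.isfile(expanded):
            problems.append(f"{label} does not point to a file: {path}")
        elif not os.access(expanded, os.R_OK):
            problems.append(f"{label} is not readable: {path}")
    return problems


# Demonstration with placeholder files (not real Kerberos material).
with tempfile.TemporaryDirectory() as tmp:
    keytab = os.path.join(tmp, "demo.keytab")
    conf = os.path.join(tmp, "krb5.conf")
    for path in (keytab, conf):
        with open(path, "wb") as handle:
            handle.write(b"placeholder")
    ok = check_kerberos_inputs("demo@EXAMPLE.COM", keytab, conf)
    missing = check_kerberos_inputs("", keytab, os.path.join(tmp, "absent.conf"))

print(ok)
print(missing)
```

The helper only confirms that the files exist and are readable. It does not validate their contents, so a malformed keytab would still fail when the ticket is requested.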
A job requires a conda environment that has the BDS module of ADS installed. It also requires secrets and configuration information that can be used to obtain a Kerberos ticket for authentication. The keytab and krb5.conf files must be available on the job instance, and they can be copied there as part of the job. We recommend that you save them in the vault and then use BDSSecretKeeper to access them. This is secure because the vault provides access control and allows for key rotation without breaking existing jobs. You can use the notebook to load configuration parameters such as hive_port. The keytab and krb5.conf files are securely loaded from the vault and then saved on the job instance. The krbcontext() method is then used to create the Kerberos ticket. Once the ticket is created, you can query BDS.
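A job script can follow the same pattern as the notebook example: load the secret, create the ticket with krbcontext(), then query. The sketch below makes several assumptions. The function names run_job and hive_connect_kwargs and the environment variable BDS_SECRET_OCID are hypothetical, the secret is assumed to hold the principal, keytab_path, hive_host, and hive_port fields used earlier, and converting the port to an integer is a defensive choice rather than a documented requirement.

```python
import os


def hive_connect_kwargs(cred):
    """Map the secret's fields to keyword arguments for hive.connect()."""
    return {
        "host": cred["hive_host"],
        "port": int(cred["hive_port"]),  # assumption: the port may arrive as a string
        "auth": "KERBEROS",
        "kerberos_service_name": "hive",
    }


def run_job(secret_id, sql):
    """Load credentials from the vault, obtain a ticket, and run one query."""
    # Imported lazily so the module still loads where ADS and PyHive are absent.
    import ads
    from ads.bds.auth import krbcontext
    from ads.secrets.big_data_service import BDSSecretKeeper
    from pyhive import hive

    ads.set_auth("resource_principal")
    with BDSSecretKeeper.load_secret(secret_id) as cred:
        with krbcontext(principal=cred["principal"], keytab_path=cred["keytab_path"]):
            cursor = hive.connect(**hive_connect_kwargs(cred)).cursor()
            cursor.execute(sql)
            return cursor.fetchall()


# BDS_SECRET_OCID is a hypothetical variable name set in the job configuration.
secret_id = os.environ.get("BDS_SECRET_OCID")
if secret_id:
    print(run_job(secret_id, "SHOW TABLES"))
```

Reading the secret OCID from the environment keeps it out of the script, so the same job artifact can point at a different secret without a code change.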