ads.dbmixin package#

Submodules#

ads.dbmixin.db_pandas_accessor module#

class ads.dbmixin.db_pandas_accessor.ConnectionFactory[source]#

Bases: object

connectionprovider = {'hive': <class 'ads.bds.big_data_service.ADSHiveConnection'>}#
classmethod get(engine='oracle')[source]#
class ads.dbmixin.db_pandas_accessor.DBAccessMixin[source]#

Bases: object

static read_sql(sql: str, connection_parameters: dict, bind_variables: Dict = {}, chunksize: int | None = None, engine='oracle') DataFrame | Iterator[DataFrame][source]#

Read SQL query from oracle database into a DataFrame.

Parameters:
  • sql (str) – SQL query to be executed.

  • connection_parameters (dict) – A dictionary of connection_parameters - {“user_name”:””, “password”:””, “service_name”:””, “wallet_location”:””}

  • bind_variables (Optional[Dict]) – Key value of pair of bind variables and corresponding values

  • chunksize (Optional[int], default None) – If specified, return an iterator where chunksize is the number of rows to include in each chunk.

  • engine ({'oracle', 'mysql', 'hive'}, default 'oracle') – Select the database type - MySQL/Oracle/Hive to store the data

Returns:

DataFrame or Iterator[DataFrame].

Return type:

DataFrame or Iterator[DataFrame]

Examples

>>> connection_parameters = {
        "user_name": "<username>",
        "password": "<password>",
        "service_name": "{service_name}_{high|med|low}",
        "wallet_location": "/full/path/to/my_wallet.zip",
    }
>>> import pandas as pd
>>> import ads
>>> df = pd.DataFrame.ads.read_sql("SELECT * from Employee", connection_parameters=connection_parameters)
>>> df_with_bind = pd.DataFrame.ads.read_sql("SELECT * from EMPLOYEE WHERE EMPLOYEE_ID = :ID", bind_variables={"ID":"121212", connection_parameters=connection_parameters)
to_sql(table_name: str, connection_parameters: dict, if_exists: str = 'fail', batch_size=100000, engine='oracle', encoding='utf-8')[source]#

To save the dataframe df to database.

Parameters:
  • table_name (str) – Name of SQL table.

  • connection_parameters (dict) – A dictionary of connection_parameters - {“user_name”:””, “password”:””, “service_name”:””, “wallet_location”:””}

  • if_exists (: {'fail', 'replace', 'append'}, default 'fail') – How to behave if the table already exists. * fail: Raise a ValueError. If table exists, do nothing * replace: Drop the table before inserting new values. If table exists, drop it, recreate it, and insert data. * append: Insert new values to the existing table. If table exists, insert data. Create if does not exist.

  • batch_size (int, default 100000) – Inserting in batches improves insertion performance. Choose this value based on available memore and network bandwidth.

  • engine ({'oracle', 'mysql'}, default 'oracle') – Select the database type - MySQL or Oracle to store the data

  • encoding (str, default is "utf-8") – Encoding provided will be used for ecoding all columns, when inserting into table

Returns:

Nothing.

Return type:

None

Examples

>>> connection_parameters = {
        "user_name": "<username>",
        "password": "<password>",
        "service_name": "{service_name}_{high|med|low}",
        "wallet_location": "/full/path/to/my_wallet.zip",
    }
>>> import pandas as pd
>>> import ads
>>> df2 = pd.read_csv("my/data/csv")
>>> df2.ads.to_sql("MY_DATA_CSV", connection_parameters=connection_parameters)

Module contents#