I have a working call to a MS SQL Server using pyodbc + pandas. I was looking to speed it up with dask-sql, but I don’t manage to find an equivalent that works. My organization requires us to use Microsoft Entra MFA for identification, which might be an issue.
This is the pyodbc + pandas call:
import pyodbc
import pandas as pd
server = "tcp:sql-our-database-swec.database.windows.net"
database = "sqldb-OurDatabase-prod"
driver = "{ODBC Driver 18 for SQL Server}"
un = "{my@email.com}"
authentication = "ActiveDirectoryInteractive"
s = f"Driver={driver};Server={server};Port=1433;Database={database};Uid={un};Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;Authentication={authentication}"
cnxn = pyodbc.connect(s)
cursor = cnxn.cursor()
query = f"select table_name from [{database}].INFORMATION_SCHEMA.TABLES where TABLE_TYPE = 'BASE TABLE'"
table_df = pd.read_sql(query, cnxn)
This is my dask-sql call:
import dask.dataframe as ddf
server = "tcp:sql-our-database-swec.database.windows.net"
database = "sqldb-OurDatabase-prod"
driver_plus = "ODBC+Driver+18+for+SQL+Server"
un = "{my@email.com}"
authentication = "ActiveDirectoryInteractive"
ssa = f"mssql+pyodbc://{server}/{database}?driver={driver_plus}&Authentication={authentication}?Trusted_Connection=yes"
query = f"select table_name from [{database}].INFORMATION_SCHEMA.TABLES where TABLE_TYPE = 'BASE TABLE'"
df = ddf.read_sql_query(sql=query, con=ssa, if_exists='replace', index_col="name")
I get the following error but I have the feeling that I need to change more in order to make it work:
df = ddf.read_sql_query(sql=query, con=ssa, if_exists=‘replace’, index_col=“name”)
File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\dask\dataframe\io\sql.py:109 in read_sql_query
engine = sa.create_engine(con, **engine_kwargs)File :2 in create_engine
File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\util\deprecations.py:281 in warned
return fn(*args, **kwargs) # type: ignore[no-any-return]File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\engine\create.py:546 in create_engine
u = _url.make_url(url)File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\engine\url.py:842 in make_url
return _parse_url(name_or_url)File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\engine\url.py:903 in _parse_url
components[“port”] = int(components[“port”])ValueError: invalid literal for int() with base 10: ‘sql-sf-forestcore-swec.database.windows.net’