Dask read_sql to Microsoft SQL Server with Microsoft Entra authentication

I have a working call to a MS SQL Server using pyodbc + pandas. I was looking to speed it up with dask-sql, but I don’t manage to find an equivalent that works. My organization requires us to use Microsoft Entra MFA for identification, which might be an issue.

This is the pyodbc + pandas call:

import pyodbc
import pandas as pd

server = "tcp:sql-our-database-swec.database.windows.net"
database = "sqldb-OurDatabase-prod"
driver = "{ODBC Driver 18 for SQL Server}" 
un = "{my@email.com}"
authentication = "ActiveDirectoryInteractive"

s = f"Driver={driver};Server={server};Port=1433;Database={database};Uid={un};Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;Authentication={authentication}"
cnxn = pyodbc.connect(s)        
cursor = cnxn.cursor()
query = f"select table_name from [{database}].INFORMATION_SCHEMA.TABLES where TABLE_TYPE = 'BASE TABLE'"
table_df = pd.read_sql(query, cnxn)

This is my dask-sql call:

import dask.dataframe as ddf

server = "tcp:sql-our-database-swec.database.windows.net"
database = "sqldb-OurDatabase-prod"
driver_plus = "ODBC+Driver+18+for+SQL+Server"
un = "{my@email.com}"
authentication = "ActiveDirectoryInteractive"

ssa = f"mssql+pyodbc://{server}/{database}?driver={driver_plus}&Authentication={authentication}?Trusted_Connection=yes"
query = f"select table_name from [{database}].INFORMATION_SCHEMA.TABLES where TABLE_TYPE = 'BASE TABLE'"
df = ddf.read_sql_query(sql=query, con=ssa, if_exists='replace', index_col="name")

I get the following error but I have the feeling that I need to change more in order to make it work:

df = ddf.read_sql_query(sql=query, con=ssa, if_exists=‘replace’, index_col=“name”)

File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\dask\dataframe\io\sql.py:109 in read_sql_query
engine = sa.create_engine(con, **engine_kwargs)

File :2 in create_engine

File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\util\deprecations.py:281 in warned
return fn(*args, **kwargs) # type: ignore[no-any-return]

File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\engine\create.py:546 in create_engine
u = _url.make_url(url)

File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\engine\url.py:842 in make_url
return _parse_url(name_or_url)

File C:\Program Files\Miniconda3\envs\rot_env\lib\site-packages\sqlalchemy\engine\url.py:903 in _parse_url
components[“port”] = int(components[“port”])

ValueError: invalid literal for int() with base 10: ‘sql-sf-forestcore-swec.database.windows.net

Looks like SQLAlchemy does not like the tcp:.

What happens if you remove it?

I’m not sure if there is an easy way to convert pyodbc settings to sql alchemy URI.