When calling set_index() on a dataframe without specifying the divisions, Dask triggers a computation right away. This is fine. But it uses the default “threaded” scheduler instead of the proper client we created with Client(address=...). I think the reason is that we set set_as_default to False when creating the proper client. I do have the client object at hand. How can I tell set_index() to use it?
Also, about not using set_as_default: it’s a decision previously made in my team (I’ll have more discussion on the why). But it seems that with it set to False, there are surprises when you call functions like df.compute() or, here, df.set_index(). For the former we can still specify df.compute(scheduler=our_client). Do you generally recommend that we leave set_as_default set to True anyway?