-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cyclic foreign vars #177
cyclic foreign vars #177
Conversation
proof-of-concept PR for allowing user-defined custom dependencies,
the arrows from inout still also show up if the process that uses the inout var is set to explicitly depend on it |
So, we probably still have to move the check to a place where we can actually view the model. For the checking, I used a dictionary of |
def add_inout_arrows(self): | ||
for p_name, p_obj in self.model._processes.items(): | ||
p_cls = type(p_obj) | ||
for var_name, var in variables_dict(p_cls).items(): | ||
# test if the variable is inout | ||
if ( | ||
var.metadata["intent"] == VarIntent.INOUT | ||
and var.metadata["var_type"] != VarType.FOREIGN | ||
): | ||
target_keys = _get_target_keys(p_obj, var_name) | ||
|
||
# now again cycle through all processes to see if there is a variable with the same reference | ||
for p2_name, p2_obj in self.model._processes.items(): | ||
p2_cls = type(p2_obj) | ||
|
||
# skip this if it is a dependent process or the process itself | ||
if ( | ||
p_name in self.model.dependent_processes[p2_name] | ||
or p_name == p2_name | ||
): | ||
continue | ||
|
||
for var2_name, var2 in variables_dict(p2_cls).items(): | ||
# if the variable is | ||
target2_keys = _get_target_keys(p2_obj, var2_name) | ||
if len(set(target_keys) & set(target2_keys)): | ||
edge_ends = p_name, p2_name | ||
self.g.edge( | ||
*edge_ends, weight="200", **INOUT_EDGE_ATTRS | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need some work still, would be much easier if we have the deps_dict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @joeperdefloep for tackling this, that's awesome!!
This is a very important update in xarray-simlab, so I'd suggest to break this PR into several ones if you don't mind, at least one for each of the following:
- add user-defined process dependencies
- update how the graph is shown (
dot.py
) - implement and/or expose graph reduction
It will be much easier for me to review this (it's gonna take me some time too).
return self._dep_processes | ||
|
||
def _check_inout_vars(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's gonna take a little while for me to digest all of this :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course, we can have multiple variables that have an inout somewhere, so we make two dicts with inout an in variables and sets of processes in which that variable is in/inout, so we can run the algorithm on every variable separately.
from now, for one variable:
so a good graph should look like:
inout1->inout2->inout3
/ \ ^ \ ^ \
/ \ / \ / \
in1 in2 in3 in4
in_ps = {in1,in2,in3,in4}, io_ps = {inout1,inout2,inout3}
let's say we start at inout2:
- io_stack = [inout2]
- since inout1 is in deps_dict[inout2], and not in verified_ios=[], we add it to the stack
- io_stack =[inout2,inout1]
- since there are no inouts in deps_dict[inout1], we are at the root node.
a. all in variables in deps_dict[inout1] are safe, so delete them: in_ps = {in2,in3,in4}
b. add to verified_ios = [inout1]
c. pop: io_stack=[inout2] - inout1 is in deps_dict[inout2] = {inout1} == set(verified_ios) so we have a linear graph in inout variables
a. child in variables: {in2}
b. check for in2 if verified_io[-1] = inout1 is in deps_dict[in2], e.g. in2 comes after inout1
c. add to verified_ios = [inout1,inout2]
d. remove child in variables: in_ps = {in3,in4}
e. pop: io_stack = []
now start again from inout3:
- io_stack = [inout3]
{inout1,inout2}
in deps_dict[inout3], equals set(verified_ios), so safe (same as 10)
a. for all child in processes: {in3} check if verified_ios[-1]=inout2 in deps_dict[in3] -> good
b. verified_ios=[io1,io2,io3]
b. remove child in variables: in_ps={in4}
c. pop io_stack=[]
now start from inout1: in verified_ios, so skip
final check:
- all remaining in_ps={in4}, check for inout4 in in_deps[in4]->good
I think the errors will become more clear if I write some tests
|
||
return descendants | ||
|
||
def transitive_reduction(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said in #164 (comment), I don't think we should apply transitive reduction on the graph, as it may loose some information that is potentially useful to know. Consider the following graph of processes:
/->-\
A->B->C
It's good to know that if we remove B
there will still be a dependency between A
and C
.
Let me show a real-world example taken from fastscape:
without reduction:
with reduction:
While the reduced graph looks much cleaner, there's a lot of hidden relationships and it's hard to figure out that, e.g., bedrock
depends directly on init_bedrock
.
I'd rather keep this feature disabled by default and allow enabling it via several options:
- a model visualization option:
model.visualize(reduce_graph=True)
- a Model method, e.g.,
reduced_model = model.reduce_graph()
(let's keep models immutable) - a Model constructor option, e.g.,
model = xs.Model({...}, reduce_graph=True)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now it's an option at model creation
xsimlab/model.py
Outdated
@@ -1108,7 +1293,7 @@ def drop_processes(self, keys): | |||
processes_cls = { | |||
k: type(obj) for k, obj in self._processes.items() if k not in keys | |||
} | |||
return type(self)(processes_cls) | |||
return type(self)(processes_cls, self._custom_dependencies) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this won't be as simple as that, but we can look at this in follow-up PRs if we're sure that errors are raised in case of any inconsistency or ambiguity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @joeperdefloep for tackling this, that's awesome!!
This is a very important update in xarray-simlab, so I'd suggest to break this PR into several ones if you don't mind, at least one for each of the following:
* add user-defined process dependencies * update how the graph is shown (`dot.py`) * implement and/or expose graph reduction
graph reduction (now) heavily depends on the deps_dict
(and is really just a few lines), so I don't think that shoudl be split.
how the graph is shown turns out to become more difficult: the arrows should be only from the topmost inout
variable to the lowest in
variables. For this we need the deps_dict and all inout/in variables (the same data as for the sorting algorithm, so we even need to perform a DFS)
Besides the implementation, we still have to decide how to expose this feature, whether to enable it or not by default, etc. Also, it's not yet very clear to me what is the motivation behind the dashed arrows for inout variables. Honestly, it's too much for me to follow and discuss in a single PR. Adding user-defined dependencies is a already a big feature that has a high potential to make the framework more flexible, so I'd really appreciate if we could progress incrementally to make sure that we get the design right and the feature robust. |
I feel guilty having started this particular feature discussion :) So let me give a practical example from the model I am working on (it's a bit messy right now - but work-in-progress): The process |
Ok. I get that, I can move the reduction to another PR, as well as maybe adding custom dependencies.
This will quite test my Git skills.... @jvail no worries, the main part of this PR is about enabling (and verifying) |
Thanks for the illustration @jvail. I was getting confused because we (I feel guilty too) involve two separate problems in the discussion here and in #164:
Problem 1 is what I thought @joeperdefloep you describe in your #164 (comment) (as you mentioned Fastscape's Problem 2 is what @jvail you initially describe in #164. The current workaround is to declare an That said, for problem 2 you'll still need an |
@jvail the difficulty:
then, we want arrows only from |
@benbovy I think I will close this PR until I have properly split it |
Yeah sorry @joeperdefloep a fresh start in new branches is probably the best thing to do. What we need first is a PR that
In the same PR or the next one:
Then in follow-up PRs:
|
black . && flake8
whats-new.rst
for all changes andapi.rst
for new API