-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing the get_pmcode error #18
Conversation
This commit addresses issue #17
builds are failing because of a linting error. https://github.com/UCHIC/dataretrieval/runs/5875910660?check_suite_focus=true It'd be nice if the repo had a script to reformat files, you might have to do it manually or setup the reformatter yourself. |
dataretrieval/nwis.py
Outdated
'srsname_search' : None, | ||
'show' : ['parameter_group_nm', 'casrn', 'srsname','parameter_units', 'parameter_nm'], | ||
'format' : 'rdb'} | ||
url = 'https://help.waterdata.usgs.gov/code/parameter_cd_nm_query?' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
URLs are configured at the top of the page. This line is also duplicated at line 466.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll define these url's at the top of the page. These (the one in line 454 and 466) are different urls and used only for this function.
dataretrieval/nwis.py
Outdated
if parameterCd is not None and parameterNm is not None: | ||
raise TypeError('Query must specify a parameter name or number, not both)') | ||
|
||
if parameterNm is None and parameterCd is not None: # querying based on a parameter code or list of codes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have already checked that only parameterCd or parameterNm is defined above. No need to check check again. Just check if parameterNm is defined. It will be easier to read and follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes perfect sense, I'll fix this, and the 'else' you mention below.
dataretrieval/nwis.py
Outdated
@@ -436,9 +436,9 @@ def _iv(**kwargs): | |||
return _read_json(response.json()), _set_metadata(response, **kwargs) | |||
|
|||
|
|||
def get_pmcodes(parameterCd='All', **kwargs): | |||
def get_pmcodes(parameterCd = None, parameterNm = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires a parameter to be defined to run now. It could break programs that expect to be able to run this function with no parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove kwargs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not certain what the difference is between parameterCd and parameterNm. parameterCd can be a list or a string. If it is All
then {'group_cd':'%'}
is in the payload. In all other scenarios {'parm_nm_cd':param}
is added to the payload, unless parameterNm is used, then {'parm_nm_cd':f'%{param}%'}
, for partial matches. Why are partial matches checked for only paramterNm? What are the differences between parameterNm and parameterCd? Functionally the only difference I see is partial matches.
Seems like it could be simplified but there is some functionality that is not clear to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it requires a parameter to run. I want to get Jeff opinion about the best way to manage this. Currently, some functions return nothing and others return all when using the default query because for some functions querying all is slow.
I removed kwargs becasue there is no additional argument that can be used to query with the new service, it is either parameter code, or name. When querying all, we use a different url.
parameterCd allows to query based on a parameter code, or a list of parameter codes. These numbers are introduced as a string or a list of strings (some of these codes start with 0).
parameterNm allows to query using names (e.g., discharge).
Is there a better way to allow querying from name and code in the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible to do partial matches for codes, I didn't think it woudl be useful, but I'll check on this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so the nwis api has a parameter that can take a code or a name. You're splitting it up, which could make it easier for users. Some documentation around the parameters beyond data type could be make this clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider whether we always want to do partial matches with no ability to turn it off. I'd try to make it as close to the API as possible. When making it easier for users, try not to sacrifice functionality without reason.
dataretrieval/nwis.py
Outdated
l.append(_read_rdb(response.text)) | ||
return pd.concat(l), _set_metadata(response) | ||
|
||
if parameterNm is not None and parameterCd is None: # querying based on a parameter name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be an else.
dataretrieval/nwis.py
Outdated
|
||
if parameterNm is not None and parameterCd is None: # querying based on a parameter name | ||
parameterNm ='%{0}%'.format(parameterNm) # update to include partial matches | ||
payload = {'parm_nm_cd':parameterNm,'fmt':'rdb'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{'fmt': 'rdb'}
is always part of the payload. Set it only once, maybe just before running the query below.
response = query(url, payload) | ||
return _read_rdb(response.text), _set_metadata(response, **kwargs) | ||
if len(response.text.splitlines()) < 10: # empty query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check a response code instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can, the url just returns the same header with no information when there is nothing to return.
The tests are still failing with flake8. You need to reformat the file nwis.py. |
This commit addresses issue #17