Errata References - how to process and what to include? #159

Open
eherget opened this issue Mar 21, 2018 · 7 comments

@eherget
Contributor

eherget commented Mar 21, 2018

  1. updateinfo contains references of 4 types. The "self" type seems to be a URL pointing to info about the erratum. However, I noticed that many of them contain URLs pointing to rhn.redhat.com, so I left in place the webapp code that sets an erratum's url field to "https://access.redhat.com/errata/%s" % erratum_name. I am not storing the "self" type references found in updateinfo.

  2. I added a table (errata_refs) that contains updateinfo references of types "other" and "bugzilla". I was not able to move "cve" types into this table because they need the foreign key constraint to the cve table (since CWEs also map to CVEs).

  3. Several of the "other" type references did not have ids. The ones without ids looked to be URL references to other docs of some sort. The ones with ids had values like "classification", "ref_0", "ref_2", and these matched the errata API example in the Errata API Details document. So in reposcan, only "other" type references that have non-null ids are stored in the errata_refs table (in addition to all "bugzilla" types), and the name is constructed by appending "-" + erratum_name to the value found in the reference id field. For example "classification-RHSA-2017:1931" or "ref_0-RHSA-2017:1931".
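The three points above can be sketched roughly as follows. This is only an illustration of the described behavior, not the actual reposcan or webapp code: the function names and the dict-based reference shape are hypothetical, and leaving the "bugzilla" name equal to the raw id is an assumption, since the description only gives name examples for "other" references.

```python
from typing import Dict, List, Optional

# URL template quoted in point 1.
ERRATUM_URL_TEMPLATE = "https://access.redhat.com/errata/%s"


def erratum_url(erratum_name: str) -> str:
    """Build the erratum url field without using the "self" reference."""
    return ERRATUM_URL_TEMPLATE % erratum_name


def refs_to_store(erratum_name: str,
                  references: List[Dict[str, Optional[str]]]) -> List[Dict[str, str]]:
    """Pick the references that would land in errata_refs (hypothetical helper)."""
    stored = []
    for ref in references:
        ref_type = ref.get("type")
        ref_id = ref.get("id")
        if ref_type == "bugzilla" and ref_id:
            # Assumption: bugzilla references keep their id as the name.
            stored.append({"type": "bugzilla", "name": ref_id})
        elif ref_type == "other" and ref_id:
            # Point 3: name is the reference id + "-" + erratum name.
            stored.append({"type": "other", "name": ref_id + "-" + erratum_name})
        # "self" and "cve" references, and "other" references without an id,
        # are not stored in errata_refs.
    return stored
```

With an erratum named "RHSA-2017:1931", an "other" reference with id "classification" would be stored under the name "classification-RHSA-2017:1931", while an "other" reference without an id would be dropped.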

@eherget eherget added the question Further information is requested label Mar 21, 2018
@eherget
Contributor Author

eherget commented Mar 21, 2018

Regarding point 1 above, jdobes replied:

This is good enough for all Red Hat content, which is the main goal of this project, but you can also sync Fedora or EPEL repos using reposcan, and for such third-party repos a link to access.redhat.com doesn't make sense. However, it's a low-priority issue right now.

@eherget
Contributor Author

eherget commented Mar 21, 2018

Regarding point 3 above, jdobes replied:

How useful is information like "classification-RHSA-2017:1931" or "ref_0-RHSA-2017:1931"? For "classification" references, updateinfo also contains an href value, e.g. http://www.redhat.com/security/updates/classification/#normal, and for "ref" references it contains, for example, a link to documentation, e.g. https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.9_Technical_Notes/index.html

Wouldn't it make much more sense to store and return these URLs instead of the useless "classification-RHSA-2017:1931" or "ref_0-RHSA-2017:1931" strings?
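To make the point about href values concrete, here is a minimal sketch of reading all four attributes from updateinfo-style reference elements with Python's standard library. The XML fragment is an illustrative sample built from the URLs mentioned above, not a copy of a real erratum (the title values in particular are made up), and parse_references is a hypothetical helper, not a reposcan function.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; titles are invented for the example.
SAMPLE_REFERENCES = """
<references>
  <reference href="http://www.redhat.com/security/updates/classification/#normal"
             id="classification" type="other" title="normal"/>
  <reference href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.9_Technical_Notes/index.html"
             id="ref_0" type="other" title="Technical Notes"/>
</references>
"""


def parse_references(xml_text):
    """Return one dict per <reference> element with id, title, href and type."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "id": elem.get("id"),        # None when the attribute is absent
            "title": elem.get("title"),
            "href": elem.get("href"),
            "type": elem.get("type"),
        }
        for elem in root.iter("reference")
    ]


if __name__ == "__main__":
    for ref in parse_references(SAMPLE_REFERENCES):
        print(ref["id"], "->", ref["href"])
```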

@eherget
Contributor Author

eherget commented Mar 21, 2018

My response to jdobes regarding point 1:

I'll open an issue for this one. We can decide later how exactly to handle rhn urls in "self" references vs. support for non-Red Hat content. If we are going to change it, we might want to at least let some people know. They might build something that uses the errata url and will be surprised if we switch to "self" reference urls and start sending them to rhn.redhat.com.

@eherget
Contributor Author

eherget commented Mar 21, 2018

My response to jdobes regarding point 3:

I initially thought the same thing and considered storing all 4 reference fields in our db (id, title, href, type). However, the Errata API Details document showed only ids in the responses for all types of list data. The general M.O. seemed to be to return information just 1 layer deep, so that relationships to other data were expressed only as a unique id that could be used to query something else to find that data.

For things like bugzillas and "other" references, where our API doesn't provide more details about those things, maybe it is appropriate to include the href/url with it.

It would not take much effort to include hrefs/urls in the errata_refs table and include them in the errata API response. We would have to update the docs. If you think it's worthwhile now, I can get it done in a few hours, including the Errata API Details documentation update.
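As a rough idea of what adding href to the table could look like, here is a self-contained sketch using an in-memory SQLite database. The column names, the CHECK constraint, and the helper functions are assumptions made for illustration; the real errata_refs schema and database engine may differ.

```python
import sqlite3

# Hypothetical schema: the real errata_refs table may be shaped differently.
SCHEMA = """
CREATE TABLE errata_refs (
    id INTEGER PRIMARY KEY,
    erratum_name TEXT NOT NULL,
    type TEXT NOT NULL CHECK (type IN ('bugzilla', 'other')),
    name TEXT,
    title TEXT,
    href TEXT
)
"""


def store_refs(conn, erratum_name, references):
    """Insert bugzilla/other references, keeping title and href alongside the id."""
    rows = [
        (erratum_name, ref["type"], ref.get("id"), ref.get("title"), ref.get("href"))
        for ref in references
        if ref.get("type") in ("bugzilla", "other")
    ]
    conn.executemany(
        "INSERT INTO errata_refs (erratum_name, type, name, title, href) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )


def hrefs_for(conn, erratum_name):
    """Return the stored hrefs for one erratum in insertion order."""
    cur = conn.execute(
        "SELECT href FROM errata_refs WHERE erratum_name = ? ORDER BY id",
        (erratum_name,),
    )
    return [row[0] for row in cur]
```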

@eherget
Contributor Author

eherget commented Mar 21, 2018

Let's continue this discussion here in this issue.

I left point 1 and point 3 in the same discussion because both points are about errata references that come from updateinfo.

@eherget
Contributor Author

eherget commented Mar 21, 2018

As was pointed out during our scrum call discussion, the "other" type references found in updateinfo include some with ids ("classification", "ref_0", "ref_1"...) and some without ids. Those without ids appear to be urls/hrefs to other information relevant to the errata. I would think that might be useful for the errata api to return... it currently does not return anything about these non-id references.

My current thinking is that we should store all reference fields - "title", "href", "id" and "type" in the errata_refs table and return all of them in the bugzilla_list field (for "type" == "bugzilla") and the reference_list field (for "type" == "other").
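A possible shape for that response, sketched under the assumption that references are available as dicts with the four fields. The function name and the dict layout are hypothetical; only the bugzilla_list and reference_list field names come from the discussion.

```python
REFERENCE_FIELDS = ("id", "title", "href", "type")


def build_reference_lists(references):
    """Split references into bugzilla_list and reference_list, keeping all four fields."""
    response = {"bugzilla_list": [], "reference_list": []}
    for ref in references:
        # Missing fields (for example the id of an id-less "other" reference)
        # are returned as None rather than dropped.
        entry = {field: ref.get(field) for field in REFERENCE_FIELDS}
        if entry["type"] == "bugzilla":
            response["bugzilla_list"].append(entry)
        elif entry["type"] == "other":
            response["reference_list"].append(entry)
        # "self" and "cve" references are handled elsewhere.
    return response
```

Returning id-less "other" references with id set to None is one way to expose the hrefs that the API currently omits.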

I'm not sure what the best path is for the references of "type" == "self". I did some analysis when implementing the references parsing in reposcan and found there was never more than 1 reference with "type" == "self" for an erratum. If there is a process by which an existing erratum's "self" reference can be updated to change it from an obsolete rhn.redhat.com url to a new, valid one, then using the "self" reference for the erratum's url field might be a mechanism by which obsolete urls are identified and fixed.

@MichaelMraka
Contributor

> My current thinking is that we should store all reference fields - "title", "href", "id" and "type" in the errata_refs table and return all of them in the bugzilla_list field (for "type" == "bugzilla") and the reference_list field (for "type" == "other").

Yes, I'd do it the same way.

As for "self", I'd store the reference in the database (not generate it in webapp) because it will allow us to store non-Red Hat errata in the future without code changes.
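One way this could look in code, as a sketch only: the url comes from the stored "self" reference when there is one. The fallback to the access.redhat.com template is an assumption added here for errata that have no stored "self" reference; it is not something proposed in the thread, and pick_erratum_url is a hypothetical name.

```python
def pick_erratum_url(erratum_name, references):
    """Prefer the stored "self" reference href; otherwise fall back to the template."""
    for ref in references:
        if ref.get("type") == "self" and ref.get("href"):
            # Earlier analysis found at most one "self" reference per erratum,
            # so the first match is taken.
            return ref["href"]
    # Assumed fallback for errata without a usable "self" reference.
    return "https://access.redhat.com/errata/%s" % erratum_name
```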

@eherget eherget self-assigned this Apr 19, 2018