When contributing to this repository, please first discuss the change you wish to make with the owners of this repository via a GitHub issue, email, or any other method.
Please note that we have a code of conduct; please follow it in all your interactions with the project.
The scope of sklearn-pmml-model is to add PMML import functionality to all major estimator classes of the popular machine learning library scikit-learn.
The API is designed to closely resemble the scikit-learn API. The same directory and component structure is used, and each estimator is a sub-class of the corresponding scikit-learn estimator. Note that some models may not have a scikit-learn implementation (e.g., Bayesian networks) and hence cannot currently be represented.
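The sub-classing design described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not code from sklearn-pmml-model: `LinearRegressor` is a hypothetical stand-in for a scikit-learn estimator, `PMMLLinearRegressor` is a hypothetical sub-class, and the PMML document is simplified (real PMML files declare an XML namespace and more metadata).

```python
import xml.etree.ElementTree as ET


class LinearRegressor:
  """Hypothetical stand-in for a scikit-learn estimator (not the real class)."""

  def __init__(self):
    self.coef_ = None
    self.intercept_ = None

  def predict(self, rows):
    # Plain linear model: intercept + sum(coefficient * feature).
    return [
      self.intercept_ + sum(c * x for c, x in zip(self.coef_, row))
      for row in rows
    ]


class PMMLLinearRegressor(LinearRegressor):
  """Hypothetical sub-class that reads fitted attributes from PMML instead of fit()."""

  def __init__(self, pmml):
    super().__init__()
    # Assumption: the namespace that real PMML files declare is omitted here.
    table = ET.fromstring(pmml).find(".//RegressionTable")
    self.intercept_ = float(table.get("intercept"))
    self.coef_ = [float(p.get("coefficient")) for p in table.findall("NumericPredictor")]


PMML = """<PMML>
  <RegressionModel>
    <RegressionTable intercept="1.0">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

model = PMMLLinearRegressor(PMML)
predictions = model.predict([[1.0, 2.0], [0.0, 0.0]])
print(predictions)
```

Because the sub-class only changes how the fitted attributes are obtained, the inherited `predict` method is untouched, which is the sense in which the outward-facing API stays the same.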
We intend for the library to remain as lightweight as possible, and to stick to the minimum number of additions needed to enable PMML import functionality without affecting the outward-facing API of estimators.
We use GitHub issues to track all bugs and feature requests; feel free to open an issue if you have found a bug or wish to see a feature implemented.
It is recommended to check that your issue complies with the following rules before submitting:
- Verify that your issue is not currently being addressed by other issues or pull requests.
- Please include code snippets or error messages when reporting issues. When doing so, please make sure to format them using code blocks. See Creating and highlighting code blocks.
- It can often be helpful to include your operating system type and version number, as well as your Python, sklearn-pmml-model, scikit-learn, numpy, and scipy versions. This information can be found by running the following code snippet:
import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)
import sklearn_pmml_model; print("sklearn-pmml-model", sklearn_pmml_model.__version__)
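If one of these packages is not installed, the snippet above stops at the failing import. A possible variant that reports every version it can find is sketched below; it assumes Python 3.8 or later (for `importlib.metadata`), and the function name `collect_versions` is made up for this example.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages=("numpy", "scipy", "scikit-learn", "sklearn-pmml-model")):
  """Return platform, Python, and package versions as a dictionary."""
  info = {"Platform": platform.platform(), "Python": sys.version.split()[0]}
  for name in packages:
    try:
      # Looks up the installed distribution by its PyPI name.
      info[name] = metadata.version(name)
    except metadata.PackageNotFoundError:
      info[name] = "not installed"
  return info


report = collect_versions()
for key, value in report.items():
  print(f"{key}: {value}")
```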
These are the steps you need to take to create a copy of the sklearn-pmml-model repository on your computer.
- Create an account on GitHub if you do not already have one.
- Clone your fork of the sklearn-pmml-model repository from your GitHub account. Use a git GUI application (e.g., Sourcetree, GitKraken) or, from the command line, run:
$ git clone git@github.com:iamDecode/sklearn-pmml-model.git
$ cd sklearn-pmml-model
- Create a feature branch to hold your development changes:
$ git checkout -b <username>/<feature description>
(For example: decode/regression-trees)
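As a small illustration of the branch naming pattern, the sketch below builds a branch name from a username and a free-form description. The helper `branch_name` is hypothetical, and its slug rules (lowercase, hyphen-separated) are an assumption extrapolated from the single example above rather than a documented project rule.

```python
import re


def branch_name(username, description):
  """Build '<username>/<feature description>' with a lowercase, hyphenated description."""
  # Assumption: any run of characters other than a-z and 0-9 becomes one hyphen.
  slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
  if not username or not slug:
    raise ValueError("both a username and a feature description are required")
  return f"{username}/{slug}"


name = branch_name("decode", "Regression trees")
print(f"git checkout -b {name}")
```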
After you have created a copy of our main repository on GitHub, you need to set up a local development environment. We recommend creating a virtual environment and activating it:
$ python3 -m venv venv
$ source venv/bin/activate
and install the dependencies within the virtual environment:
$ pip install -r requirements.txt
The final step is to build the Cython extensions (you need to rebuild them whenever you change the Cython code):
$ python setup.py build_ext --inplace
For pull requests to be accepted, your changes must at least meet the following requirements:
- All changes related to one feature must belong to one branch. Each branch must be self-contained, with a single new feature or bugfix.
- Commit messages should be formulated according to Conventional Commits.
- If your pull request addresses an issue, please make sure to link back to the original issue.
- Follow the PEP8 style guide, with the following exceptions or additions:
  - The max line length is 120 characters instead of 79.
  - Indent with two spaces, not 4 spaces or tabs.
You can check for compliance locally by running:
$ flake8 sklearn_pmml_model
- Each function, class, method, and attribute needs to be documented using docstrings. sklearn-pmml-model conforms to the numpy docstring standard.
- Finally, ensure all the test cases still pass after you have made your changes. To test locally, you can run:
$ python setup.py pytest
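To make the style and docstring requirements above concrete, here is a short sketch of a function written with two-space indentation and a numpy-style docstring. The function `clip` is a made-up example and is not part of sklearn-pmml-model; the exact section layout the project expects may differ in detail.

```python
def clip(values, lower, upper):
  """
  Limit every value to the closed interval [lower, upper].

  Parameters
  ----------
  values : list of float
    Numbers to limit.

  lower : float
    Smallest value allowed in the result.

  upper : float
    Largest value allowed in the result.

  Returns
  -------
  list of float
    A new list with each out-of-range value moved to the nearest bound.

  Raises
  ------
  ValueError
    If `lower` is greater than `upper`.

  """
  if lower > upper:
    raise ValueError("lower must not exceed upper")
  return [min(max(value, lower), upper) for value in values]


clipped = clip([-1.0, 0.5, 3.0], 0.0, 1.0)
print(clipped)
```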
In addition to these requirements, we strongly prefer that you follow the guidelines below. They are not strictly required, so as not to be overly prohibitive to new contributors.
- Your change should include test cases for all new functionality being introduced.
- No additional code style issues should be reported by LGTM.
Continuous integration will automatically verify compliance with all of the discussed requirements.
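As a rough illustration of the guideline on test cases, the sketch below pairs a small helper with tests for its normal, empty, and invalid inputs. Both the helper `parse_array` and the test names are hypothetical; the tests use plain `assert` statements in functions prefixed with `test_`, which pytest collects under its default settings, but the project's own tests may be organised differently.

```python
def parse_array(text):
  """Convert whitespace-separated numbers into a list of floats (hypothetical helper)."""
  return [float(token) for token in text.split()]


def test_parse_array_values():
  assert parse_array("1 2.5 -3") == [1.0, 2.5, -3.0]


def test_parse_array_empty():
  assert parse_array("") == []


def test_parse_array_rejects_non_numeric():
  # float() raises ValueError for tokens that are not numbers.
  try:
    parse_array("1 two 3")
  except ValueError:
    return
  raise AssertionError("expected ValueError for non-numeric input")
```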
- When you are done coding in your feature branch, add changed or new files:
$ git add path/to/modified_file
- Create a commit with a message describing what you changed. Commit messages should be formulated according to the Conventional Commits standard:
$ git commit
- Push the changes to GitHub:
$ git push -u origin <username>/<feature description>
- Create a pull request.
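The Conventional Commits format mentioned above gives the first line of a commit message the shape `<type>[optional scope]: <description>`, with an optional `!` before the colon to flag a breaking change. The sketch below checks that shape with a regular expression. It is a simplified illustration, not a full implementation of the specification: the helper `parse_header` is hypothetical, types are assumed to be lowercase, scopes are assumed to contain no spaces, and the message body and footers are ignored.

```python
import re

# Simplified header pattern: <type>[(scope)][!]: <description>
HEADER = re.compile(
  r"^(?P<type>[a-z]+)"
  r"(\((?P<scope>[^()\s]+)\))?"
  r"(?P<breaking>!)?"
  r": (?P<description>\S.*)$"
)


def parse_header(message):
  """Return the parts of a commit header as a dict, or None if it does not match."""
  first_line = message.splitlines()[0] if message else ""
  match = HEADER.match(first_line)
  return match.groupdict() if match else None


for example in ("feat(tree): add regression tree import", "Updated stuff"):
  print(example, "->", parse_header(example))
```

A header such as `fix: handle missing intercept` matches, whereas a free-form message such as `Updated stuff` does not and returns `None`.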