This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

I have made some minor changes in the comments of the code and some modifications in documentation. It is very useful when we have a crystal clear comments to explain the code which makes it well documented and clean code. #5697

Open: wants to merge 4 commits into base: master
24 changes: 12 additions & 12 deletions README.md
@@ -39,7 +39,7 @@ $ pip install nni

To update NNI to the latest version, add `--upgrade` flag to the above commands.

## NNI capabilities in a glance
## NNI capabilities at a glance

<img src="docs/img/overview.svg" width="100%"/>

@@ -220,20 +220,20 @@ To update NNI to the latest version, add `--upgrade` flag to the above commands.

## Contribution guidelines

If you want to contribute to NNI, be sure to review the [contribution guidelines](https://nni.readthedocs.io/en/stable/notes/contributing.html), which includes instructions of submitting feedbacks, best coding practices, and code of conduct.
If you want to contribute to NNI, be sure to review the [contribution guidelines](https://nni.readthedocs.io/en/stable/notes/contributing.html), which includes instructions for submitting feedback, best coding practices, and code of conduct.

We use [GitHub issues](https://github.com/microsoft/nni/issues) to track tracking requests and bugs.
Please use [NNI Discussion](https://github.com/microsoft/nni/discussions) for general questions and new ideas.
For questions of specific use cases, please go to [Stack Overflow](https://stackoverflow.com/questions/tagged/nni).
For questions about specific use cases, please go to [Stack Overflow](https://stackoverflow.com/questions/tagged/nni).

Participating discussions via the following IM groups is also welcomed.
Participating in discussions via the following IM groups is also welcomed.

|Gitter||WeChat|
|----|----|----|
|![image](https://user-images.githubusercontent.com/39592018/80665738-e0574a80-8acc-11ea-91bc-0836dc4cbf89.png)| OR |![image](https://github.com/scarlett2018/nniutil/raw/master/wechat.png)|

Over the past few years, NNI has received thousands of feedbacks on GitHub issues, and pull requests from hundreds of contributors.
We appreciate all contributions from community to make NNI thrive.
We appreciate all contributions from the community to make NNI thrive.

<img src="https://img.shields.io/github/contributors-anon/microsoft/nni"/>

@@ -266,15 +266,15 @@ We appreciate all contributions from community to make NNI thrive.

## Related Projects

Targeting at openness and advancing state-of-art technology, [Microsoft Research (MSR)](https://www.microsoft.com/en-us/research/group/systems-and-networking-research-group-asia/) had also released few other open source projects.
Targeting openness and advancing state-of-art technology, [Microsoft Research (MSR)](https://www.microsoft.com/en-us/research/group/systems-and-networking-research-group-asia/) has also released a few other open-source projects.

* [OpenPAI](https://github.com/Microsoft/pai) : an open source platform that provides complete AI model training and resource management capabilities, it is easy to extend and supports on-premise, cloud and hybrid environments in various scale.
* [FrameworkController](https://github.com/Microsoft/frameworkcontroller) : an open source general-purpose Kubernetes Pod Controller that orchestrate all kinds of applications on Kubernetes by a single controller.
* [MMdnn](https://github.com/Microsoft/MMdnn) : A comprehensive, cross-framework solution to convert, visualize and diagnose deep neural network models. The "MM" in MMdnn stands for model management and "dnn" is an acronym for deep neural network.
* [SPTAG](https://github.com/Microsoft/SPTAG) : Space Partition Tree And Graph (SPTAG) is an open source library for large scale vector approximate nearest neighbor search scenario.
* [nn-Meter](https://github.com/microsoft/nn-Meter) : An accurate inference latency predictor for DNN models on diverse edge devices.
* [OpenPAI](https://github.com/Microsoft/pai): an open-source platform that provides complete AI model training and resource management capabilities, it is easy to extend and supports on-premise, cloud, and hybrid environments in various scales.
* [FrameworkController](https://github.com/Microsoft/frameworkcontroller): an open-source general-purpose Kubernetes Pod Controller that orchestrates all kinds of applications on Kubernetes by a single controller.
* [MMdnn](https://github.com/Microsoft/MMdnn): A comprehensive, cross-framework solution to convert, visualize and diagnose deep neural network models. The "MM" in MMdnn stands for model management and "dnn" is an acronym for deep neural network.
* [SPTAG](https://github.com/Microsoft/SPTAG): Space Partition Tree And Graph (SPTAG) is an open-source library for large-scale vector approximate nearest neighbor search scenarios.
* [nn-Meter](https://github.com/microsoft/nn-Meter): An accurate inference latency predictor for DNN models on diverse edge devices.

We encourage researchers and students leverage these projects to accelerate the AI development and research.
We encourage researchers and students to leverage these projects to accelerate AI development and research.

## License

12 changes: 6 additions & 6 deletions docs/source/feature_engineering/gbdt_selector.rst
@@ -12,7 +12,7 @@ For now, we support the ``importance_type`` is ``split`` and ``gain``. But we wi
Usage
^^^^^

First you need to install dependency:
First you need to install the dependency:

.. code-block:: bash

@@ -32,8 +32,8 @@ Then
fgs = GBDTSelector()
# fit data
fgs.fit(X_train, y_train, ...)
# get improtant features
# will return the index with important feature here.
# get important features
# will return the index with an important feature here.
print(fgs.get_selected_features(10))

...
@@ -53,7 +53,7 @@ And you could reference the examples in ``/examples/feature_engineering/gbdt_sel
**lgb_params** (dict, require) - The parameters for lightgbm model. The detail you could reference `here <https://lightgbm.readthedocs.io/en/latest/Parameters.html>`__

*
**eval_ratio** (float, require) - The ratio of data size. It's used for split the eval data and train data from self.X.
**eval_ratio** (float, require) - The ratio of data size. It's used to split the eval data and train data from self.X.

*
**early_stopping_rounds** (int, require) - The early stopping setting in lightgbm. The detail you could reference `here <https://lightgbm.readthedocs.io/en/latest/Parameters.html>`__.
@@ -62,9 +62,9 @@ And you could reference the examples in ``/examples/feature_engineering/gbdt_sel
**importance_type** (str, require) - could be 'split' or 'gain'. The 'split' means ' result contains numbers of times the feature is used in a model' and the 'gain' means 'result contains total gains of splits which use the feature'. The detail you could reference in `here <https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.Booster.html#lightgbm.Booster.feature_importance>`__.

*
**num_boost_round** (int, require) - number of boost round. The detail you could reference `here <https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.train.html#lightgbm.train>`__.
**num_boost_round** (int, require) - number of boost rounds. The detail you could reference `here <https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.train.html#lightgbm.train>`__.

**Requirement of get_selected_features FuncArgs**


* **topk** (int, require) - the topK impotance features you want to selected.
* **topk** (int, require) - the topK important features you want to select.
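
To make the ``topk`` argument concrete, here is a minimal sketch of top-k selection over a list of importance scores. It is not NNI's implementation: ``top_k_feature_indices`` and the sample scores are hypothetical, and the real selector takes its scores from a trained LightGBM model rather than from a hand-written list.

```python
def top_k_feature_indices(importances, topk):
    """Return the indices of the `topk` largest scores, most important first.

    Hypothetical helper for illustration only; not part of NNI.
    """
    if topk < 0:
        raise ValueError("topk must be non-negative")
    # sorted() is stable, so features with equal scores keep their original order.
    order = sorted(range(len(importances)),
                   key=lambda i: importances[i],
                   reverse=True)
    return order[:topk]


# Made-up importance scores for five features.
importances = [0.10, 0.45, 0.05, 0.30, 0.10]
print(top_k_feature_indices(importances, 3))  # [1, 3, 0]
```

The result is a list of column indices, which matches the documented behaviour of returning indices rather than feature values.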
18 changes: 9 additions & 9 deletions docs/source/feature_engineering/gradient_feature_selector.rst
@@ -24,29 +24,29 @@ Usage
...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# initlize a selector
# initialize a selector
fgs = FeatureGradientSelector(n_features=10)
# fit data
fgs.fit(X_train, y_train)
# get improtant features
# will return the index with important feature here.
# get important features
# will return the index with an important feature here.
print(fgs.get_selected_features())

...

And you could reference the examples in ``/examples/feature_engineering/gradient_feature_selector/``\ , too.
And you could reference the examples in ``/examples/feature_engineering/gradient_feature_selector/``\, too.

**Parameters of class FeatureGradientSelector constructor**


*
**order** (int, optional, default = 4) - What order of interactions to include. Higher orders may be more accurate but increase the run time. 12 is the maximum allowed order.
**order** (int, optional, default = 4) - What order of interactions to include? Higher orders may be more accurate but increase the run time. 12 is the maximum allowed order.

*
**penalty** (int, optional, default = 1) - Constant that multiplies the regularization term.

*
**n_features** (int, optional, default = None) - If None, will automatically choose number of features based on search. Otherwise, the number of top features to select.
**n_features** (int, optional, default = None) - If None, will automatically choose a number of features based on search. Otherwise, the number of top features to select.

*
**max_features** (int, optional, default = None) - If not None, will use the 'elbow method' to determine the number of features with max_features as the upper limit.
@@ -64,7 +64,7 @@ And you could reference the examples in ``/examples/feature_engineering/gradient
**shuffle** (bool, optional, default = True) - Shuffle "rows" prior to an epoch.

*
**batch_size** (int, optional, default = 1000) - Nnumber of "rows" to process at a time.
**batch_size** (int, optional, default = 1000) - N number of "rows" to process at a time.

*
**target_batch_size** (int, optional, default = 1000) - Number of "rows" to accumulate gradients over. Useful when many rows will not fit into memory but are needed for accurate estimation.
@@ -94,10 +94,10 @@ And you could reference the examples in ``/examples/feature_engineering/gradient


*
**X** (array-like, require) - The training input samples which shape = [n_samples, n_features]. `np.ndarry` recommended.
**X** (array-like, require) - The training input samples which shape = [n_samples, n_features]. `np.ndarray` recommended.

*
**y** (array-like, require) - The target values (class labels in classification, real numbers in regression) which shape = [n_samples]. `np.ndarry` recommended.
**y** (array-like, require) - The target values (class labels in classification, real numbers in regression) which shape = [n_samples]. `np.ndarray` recommended.

*
**groups** (array-like, optional, default = None) - Groups of columns that must be selected as a unit. e.g. [0, 0, 1, 2] specifies the first two columns are part of a group. Which shape is [n_features].
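
The ``groups`` encoding can be read as one group id per column. The sketch below shows how ``[0, 0, 1, 2]`` maps to column groups. ``columns_by_group`` is a hypothetical helper written for this illustration, not an NNI function, and it makes no claim about how the selector handles groups internally.

```python
from collections import defaultdict


def columns_by_group(groups):
    """Map each group id to the list of column indices that carry it.

    Hypothetical helper for illustration only; not part of NNI.
    """
    grouped = defaultdict(list)
    for column, group_id in enumerate(groups):
        grouped[group_id].append(column)
    return dict(grouped)


# One entry per feature column, so len(groups) == n_features.
print(columns_by_group([0, 0, 1, 2]))  # {0: [0, 1], 1: [2], 2: [3]}
```

Columns 0 and 1 share group id 0, so they would be selected or dropped together, while columns 2 and 3 each form a group of their own.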