The master
branch has been renamed to main
on 01.31.2023 according to Khronos naming policy. Use the following commands to rename your local repository branch:
git branch -m master main
git fetch origin
git branch -u origin/main main
git remote set-head origin -a
NNEF reduces machine learning deployment fragmentation by enabling a rich mix of neural network training tools and inference engines to be used by applications across a diverse range of devices and platforms.
This repository contains tools to generate and consume NNEF documents, such as a parser (C++ and Python) that can be included in consumer applications and converters for deep learning frameworks.
A Model Zoo is now available; the 'models' folder contains a variety of NNEF models converted from various sources.
NNEF Tools folder contains tools to convert pre-trained models in tensorFlow
/caffe
/caffe2
/ONNX
to NNEF format.
NNEF Parser folder contains C++
and Python
source code for a sample NNEF graph parser.
Following the update of the NNEF specification to version 1.0.4, conversion for the corresponding operators has been added. Furthermore, error handling of non-convertible models has been greately enhanced with error messages detailing the exact cause of failure listed for all non-convertible operations before conversion is started.
The tools for converting models to NNEF and transforming NNEF models has been thoroughly reworked to make them more robust and unified and easier to maintain. The basic functionality of the main scripts has been kept, however their parameterization has been simplified and unified in some places; please refer to the readme and the help (-h
option) of the respective scripts for more details. The scripts cover the following major areas of functionality: model conversion, optimization, execution and visualization. A GMAC calculator is also provide, and further utility scripts may be added in the future.
According to the change in version 1.0.3 of the NNEF specification, quantization algorithm information has been deprecated in the tensor binary file format. The tensor binary only stores the item-type of the tensor data, and the binary reader does not return quantization information (also used to be called 'compression' info). Furthermore, the mapping between stored item-types and data-types in the structural description has been clarified, so that the reader of a tensor binary can tell what the data-type of the read tensor is. This enhances the reader as it can now properly map the binary data to C++ or Python numpy types upon reading. The C++ code has been updated to perform such a mapping, and is now able to return a typed array instead of just plain bytes.
According to a change in version 1.0.1 of the NNEF specification, the shape_of
operator in NNEF syntax is deprecated, and the parser does not support it. This enables the decoupling of parsing from shape inference, allowing parsing to succeed even if shape information is not available for all operations, such as custom defined operations before the graph definition. Shape inference can still be run after training, furthermore it can be customized (via function pointers) for custom defined operations.
There was a bug in the Python code that reads/writes the tensor binary files (the header contained 4 extra padding bytes therefore not conforming to the spec). The code has been updated to read/write and check the proper header size. As a consequence, any files written out with the code that contained the bug cannot be read back with the updated code. To aid the usage of such existing files, a script was created called fix_nnef_binary_size.py
that can be used to remove the excess 4 bytes from existing NNEF files. The script is located in the root folder of this repo, it has no dependencies (not even the NNEF parser). It can be run on the main folder of an NNEF model, and it fixes all binary files in the folder. In case one runs it on an NNEF model that does not contain the bug, it does nothing. It can be used as follows:
python fix_nnef_binary_size.py my_nnef_model_folder
Such an invocation fixes the files in place. Optionally, a second argument can be supplied to the script to write the fixed files to a different output path. In this case, the script copies all non-binary files (such as graph.nnef) to the target folder, so the resulting folder contains the whole valid model.