
How Exactly Do I Use CRF with this Library? #1769

Closed
tonychenxyz opened this issue May 2, 2020 · 12 comments

@tonychenxyz

I saw a few merge requests containing CRF that were never merged, and I couldn't find any documentation on CRF, so I'm confused. I would really appreciate it if someone could help. Thanks!

@bhack
Contributor

bhack commented May 2, 2020

Is this a dup of #337?

@SeanLee97

SeanLee97 commented May 21, 2020

@tonychenxyz It seems that the CRF layer has been removed from tensorflow_addons in the latest version. Here is my solution, based on nlp-architect, for using CRF in a Keras way.

lstm_crf.py

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

from crf import CRF


vocab_size = 10000   # size of the input token vocabulary (the original reused num_labels here)
num_labels = 10      # number of output tags
embedding_size = 100
hidden_size = 128

model = Sequential()
# Embedding's first argument is the vocabulary size, not the label count
model.add(Embedding(vocab_size, embedding_size, mask_zero=True))
model.add(Bidirectional(LSTM(hidden_size, return_sequences=True)))
model.add(Dense(num_labels))  # per-step scores, one per tag, as the CRF expects

crf = CRF(num_labels, sparse_target=True)
model.add(crf)
model.compile('adam', loss=crf.loss, metrics=[crf.accuracy])
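For intuition about what `crf.loss` computes: `crf_log_likelihood` scores the true tag path and subtracts the log-partition over all possible paths, which is obtained with the forward algorithm. Below is a small numpy-only sketch of that computation, not the tensorflow_addons implementation itself; the function and variable names are mine, for illustration only.

```python
import numpy as np

def logsumexp(x, axis=0):
    # numerically stable log(sum(exp(x))) along an axis
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def crf_neg_log_likelihood(emissions, transitions, tags):
    """emissions: (T, K) unary scores; transitions: (K, K); tags: length-T true path."""
    T, K = emissions.shape
    # score of the given path: unary terms plus pairwise transition terms
    path_score = emissions[np.arange(T), tags].sum()
    path_score += transitions[tags[:-1], tags[1:]].sum()
    # forward algorithm: alpha[j] = log-sum of scores of all prefixes ending in tag j
    alpha = emissions[0].copy()
    for t in range(1, T):
        alpha = emissions[t] + logsumexp(alpha[:, None] + transitions, axis=0)
    log_z = logsumexp(alpha, axis=0)   # log-partition over all K**T paths
    return log_z - path_score          # negative log-likelihood of the true path

rng = np.random.default_rng(0)
emissions = rng.normal(size=(5, 3))
transitions = rng.normal(size=(3, 3))
nll = crf_neg_log_likelihood(emissions, transitions, np.array([0, 2, 2, 1, 0]))
print(float(nll))
```

`crf.loss` above averages this quantity over the batch and respects the real sequence lengths, which the sketch omits.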

crf.py

import tensorflow as tf
import tensorflow.keras.backend as K
import tensorflow.keras.layers as L
from tensorflow_addons.text import crf_log_likelihood, crf_decode


class CRF(L.Layer):
    def __init__(self,
                 output_dim,
                 sparse_target=True,
                 **kwargs):
        """    
        Args:
            output_dim (int): the number of labels to tag each temporal input.
            sparse_target (bool): whether the the ground-truth label represented in one-hot.
        Input shape:
            (batch_size, sentence length, output_dim)
        Output shape:
            (batch_size, sentence length, output_dim)
        """
        super(CRF, self).__init__(**kwargs)
        self.output_dim = int(output_dim) 
        self.sparse_target = sparse_target
        self.input_spec = L.InputSpec(min_ndim=3)
        self.supports_masking = False
        self.sequence_lengths = None
        self.transitions = None

    def build(self, input_shape):
        assert len(input_shape) == 3
        f_shape = tf.TensorShape(input_shape)
        input_spec = L.InputSpec(min_ndim=3, axes={-1: f_shape[-1]})

        if f_shape[-1] is None:
            raise ValueError('The last dimension of the inputs to `CRF` '
                             'should be defined. Found `None`.')
        if f_shape[-1] != self.output_dim:
            raise ValueError('The last dimension of the input shape must be equal to output'
                             ' shape. Use a linear layer if needed.')
        self.input_spec = input_spec
        self.transitions = self.add_weight(name='transitions',
                                           shape=[self.output_dim, self.output_dim],
                                           initializer='glorot_uniform',
                                           trainable=True)
        self.built = True

    def compute_mask(self, inputs, mask=None):
        # Just pass the received mask from previous layer, to the next layer or
        # manipulate it if this layer changes the shape of the input
        return mask

    def call(self, inputs, sequence_lengths=None, training=None, **kwargs):
        sequences = tf.convert_to_tensor(inputs, dtype=self.dtype)
        if sequence_lengths is not None:
            assert len(sequence_lengths.shape) == 2
            assert tf.convert_to_tensor(sequence_lengths).dtype == 'int32'
            seq_len_shape = tf.convert_to_tensor(sequence_lengths).get_shape().as_list()
            assert seq_len_shape[1] == 1
            self.sequence_lengths = K.flatten(sequence_lengths)
        else:
            self.sequence_lengths = tf.ones(tf.shape(inputs)[0], dtype=tf.int32) * (
                tf.shape(inputs)[1]
            )

        # Decode the best tag path with Viterbi over the learned transitions
        viterbi_sequence, _ = crf_decode(sequences,
                                         self.transitions,
                                         self.sequence_lengths)
        output = K.one_hot(viterbi_sequence, self.output_dim)
        # In training, return the raw potentials (crf_loss consumes them);
        # at inference, return the one-hot Viterbi decoding.
        return K.in_train_phase(sequences, output)

    @property
    def loss(self):
        def crf_loss(y_true, y_pred):
            y_pred = tf.convert_to_tensor(y_pred, dtype=self.dtype)
            log_likelihood, self.transitions = crf_log_likelihood(
                y_pred,
                tf.cast(K.argmax(y_true), dtype=tf.int32) if self.sparse_target else y_true,
                self.sequence_lengths,
                transition_params=self.transitions,
            )
            return tf.reduce_mean(-log_likelihood)
        return crf_loss

    @property
    def accuracy(self):
        def viterbi_accuracy(y_true, y_pred):
            # -1e10 to avoid zero at sum(mask)
            mask = K.cast(
                K.all(K.greater(y_pred, -1e10), axis=2), K.floatx())
            shape = tf.shape(y_pred)
            sequence_lengths = tf.ones(shape[0], dtype=tf.int32) * (shape[1])
            y_pred, _ = crf_decode(y_pred, self.transitions, sequence_lengths)
            if self.sparse_target:
                y_true = K.argmax(y_true, 2)
            y_pred = K.cast(y_pred, 'int32')
            y_true = K.cast(y_true, 'int32')
            corrects = K.cast(K.equal(y_true, y_pred), K.floatx())
            return K.sum(corrects * mask) / K.sum(mask)
        return viterbi_accuracy

    def compute_output_shape(self, input_shape):
        tf.TensorShape(input_shape).assert_has_rank(3)
        return input_shape[:2] + (self.output_dim,)

    def get_config(self):
        config = {
            'output_dim': self.output_dim,
            'sparse_target': self.sparse_target,
            # NB: the two keys below are not __init__ arguments, so
            # tf.keras.models.load_model() may fail to rebuild this layer;
            # rebuilding the model and calling load_weights() avoids this.
            'supports_masking': self.supports_masking,
            'transitions': K.eval(self.transitions)
        }
        base_config = super(CRF, self).get_config()
        return dict(base_config, **config)
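At inference time the layer relies on `crf_decode`, which performs Viterbi decoding: the same dynamic program as the forward algorithm, but with max in place of log-sum-exp, plus backpointers to recover the best path. A small numpy sketch with toy numbers (names and values are mine, purely illustrative):

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path and its score.

    emissions: (T, K) per-step tag scores; transitions: (K, K) pairwise scores.
    """
    T, K = emissions.shape
    score = emissions[0].copy()                # best score ending in each tag
    backpointers = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        candidates = score[:, None] + transitions  # [i, j] = come from i, move to j
        backpointers[t] = candidates.argmax(axis=0)
        score = emissions[t] + candidates.max(axis=0)
    # follow backpointers from the best final tag
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backpointers[t, path[-1]]))
    path.reverse()
    return path, float(score.max())

emissions = np.array([[4., 2.], [0., 0.], [1., 4.]])
transitions = np.array([[2., 0.], [1., 3.]])
path, best = viterbi_decode(emissions, transitions)
print(path, best)  # [1, 1, 1] 12.0
```

With these toy scores, staying in tag 1 wins (2 + 0 + 4 in emissions plus 3 + 3 in transitions), even though tag 0 has the best score at the first step.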

@KacperKubara

I think the CRF layer implementation has had a few unsuccessful PRs. #1733 seems to be the current one, but it hasn't been merged yet. Better docs and tutorials would be really useful for all the CRF functionality within tfa.text, as they seem to be lacking.

@ravi0912

@SeanLee97 Hey, I tried your code and it works fine, but I run into a problem when loading the model after saving it:
tf.keras.models.load_model("tf_ner_model/tf_ner_2020_7_29.h5", custom_objects={"CRF": CRF})
I get the error below:
TypeError: __init__() missing 1 required positional argument: 'num_classes'
Am I loading the model correctly?

@SeanLee97

@ravi0912
It seems that the parameter names in get_config() did not correspond to those in __init__(), which caused initialization to fail. I have unified the parameter names. You can try again with the new version of CRF.
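The failure mode described here is generic to Keras-style serialization: from_config(config) essentially calls cls(**config), so any mismatch between the keys returned by get_config() and the parameters accepted by __init__() raises a TypeError at load time (the exact message depends on how the keys and the signature disagree). A dependency-free toy illustration of the mechanism; these classes are mine, not the actual layer code:

```python
class BrokenCRF:
    """Mimics the bug: get_config() emits 'output_dim' but __init__ takes 'num_classes'."""
    def __init__(self, num_classes):
        self.num_classes = num_classes

    def get_config(self):
        return {"output_dim": self.num_classes}  # key does not match __init__

    @classmethod
    def from_config(cls, config):
        return cls(**config)


class FixedCRF:
    """The get_config() key matches the __init__ parameter, so round-tripping works."""
    def __init__(self, output_dim):
        self.output_dim = output_dim

    def get_config(self):
        return {"output_dim": self.output_dim}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


try:
    BrokenCRF.from_config(BrokenCRF(5).get_config())
except TypeError as err:
    print("load fails:", err)

restored = FixedCRF.from_config(FixedCRF(5).get_config())
print("round-trip ok:", restored.output_dim)  # round-trip ok: 5
```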

@howl-anderson
Contributor

@SeanLee97 Other developers and I discussed this type of implementation a few months ago. Unfortunately, this solution has a shortcoming: the model can no longer be saved to and loaded from disk.

You can find the docs I wrote in the file "design_docs/crf.md" of #377

@ravi0912

@SeanLee97 I tried again and ran into the same problem.
Just building the model and loading the weights worked for me.

@seanpmorgan
Member

Consolidating this with #337. We need to create an updated example with the new layer.

@ngoquanghuy99

@SeanLee97 Hey, I tried your code and it works fine, but I run into a problem when loading the model after saving it:
tf.keras.models.load_model("tf_ner_model/tf_ner_2020_7_29.h5", custom_objects={"CRF": CRF})
I get the error below:
TypeError: __init__() missing 1 required positional argument: 'num_classes'
Am I loading the model correctly?

Have you fixed this problem yet?

@luozhouyang

Hey guys, have a look at luozhouyang/keras-crf, a more elegant and convenient CRF built on tensorflow-addons.

@Tao2301230

@SeanLee97 Hey, I tried your code and it works fine, but I run into a problem when loading the model after saving it:
tf.keras.models.load_model("tf_ner_model/tf_ner_2020_7_29.h5", custom_objects={"CRF": CRF})
I get the error below:
TypeError: __init__() missing 1 required positional argument: 'num_classes'
Am I loading the model correctly?

Have you fixed this problem yet?

Trying load_weights instead of load_model helps.

@UrszulaCzerwinska

Hi,
I just tested this solution with my sequence classification model in Keras, but the CRF lowers my score from 0.88 accuracy without it to 0.84 with it.

I also tried luozhouyang/keras-crf, but I am getting an error that is already reported in its issues, with no answer.
