Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDO Thread refactor #56

Open
alex-brinkman opened this issue Oct 26, 2022 · 1 comment
Open

SDO Thread refactor #56

alex-brinkman opened this issue Oct 26, 2022 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

@alex-brinkman
Copy link
Collaborator

    I spent some time looking and thinking about this function and, although this works, I think we are overburdening the SDO thread with error handling. I think it would be cleaner to have a separate thread for the latter. It seems to me that handling SDO read/write requests is relatively independent of handling errors such as the EMCYs from the Elmos.

A couple of points that make me think this way:

  • Ideally, sdo_thread_cond should be an indication that the predicate (queue_is_empty(&self->jsd_sdo_req_cirq), i.e. state of queue) might have changed. This is the case when we signal from jsd_sdo_push_async_request, but it is not the case when we signal from the EGD driver in the fault state (indirectly through jsd_read).
  • When the signal comes from jsd_sdo_push_request we go over the error handling code before we can get to the point of servicing the SDO request queue, even if there are no errors present.

I am imagining something like this:

void* sdo_thread_loop(void* void_data)
{
  while (true)
  {
    pthread_mutex_lock(&self->jsd_sdo_req_cirq.mutex);

    while (queue_is_empty(&self->jsd_sdo_req_cirq))
      pthread_cond_wait(&self->sdo_thread_cond, &self->jsd_sdo_req_cirq.mutex);

    jsd_sdo_req_t req = queue_pop(&self->jsd_sdo_req_cirq);

    pthread_mutex_unlock(&self->jsd_sdo_req_cirq.mutex);

    if (self->sdo_join_flag) {
      return NULL;
    }

    // Submit the SDO request with ecx_SDOwrite/ecx_SDOread

    // Queue the response.
    jsd_sdo_req_cirq_push(&self->jsd_sdo_res_cirq, req);
  }
}

We can prescind from pthread_cond_timedwait above because even if we miss a signal while servicing the one received, the queue will not be empty after that second signal, so when we get to the while loop on the predicate, we will skip the wait.

Similarly for error handling:

void* error_thread_loop(void* void_data)
{
  while (true)
  {
    pthread_mutex_lock(&self->error_handling_mutex);

    while (!self->errors_available)
      // Set ts based on how frequent we want to finish servicing the errors beyond MAX_ERRORS_PER_LOOP
      pthread_cond_timedwait(&self->error_handling_cond, &self->error_handling_mutex, &ts);

    pthread_mutex_unlock(&self->error_handling_mutex);

    if (self->error_handling_join_flag) {
      return NULL;
    }

    // Parse through errors with ecx_mbxreceive, ecx_iserror, and ecx_poperror, and queue them with jsd_error_cirq_push(&self->slave_errors[err.Slave], err)

    if (handled_errors < MAX_ERRORS_PER_LOOP) {
      // We are sure we handled all errors present.
      pthread_mutex_lock(&self->error_handling_mutex);
      self->errors_available =  false;
      pthread_mutex_unlock(&self->error_handling_mutex);
    }
  }
}

Above, if ecx_mbxreceive is real-time safe, we could have a single pthread_mutex_unlock(&self->error_handling_mutex) at the end of the outer while loop (and one before the return NULL).

And in egd.c:
...

if(state->pub.actual_state_machine_state == JSD_EGD_STATE_MACHINE_STATE_FAULT){
  pthread_mutex_lock(&self->error_handling_mutex);
  self->errors_available =  true;
  pthread_cond_signal(&self->error_handling_cond);
  pthread_mutex_unlock(&self->error_handling_mutex);

...

In my opinion having the functions separate is simpler to read/understand and probably maintain.

Originally posted by @d-loret in #46 (comment)

@alex-brinkman
Copy link
Collaborator Author

I agree the current implementation is messy. I also thought of another third pthread for error handling but felt the risk of locking was too high so kept the ball-of-mud since it performs well.

I'd accept contribution here so long as it tests out well on hardware.

@alex-brinkman alex-brinkman added the enhancement New feature or request label Dec 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants