change compile_all to parallel #9

felixschurk · 2022-11-26T16:32:40Z

Hi,
this is more of an enhancement than an issue.
Compiling everything with ./calliope -p 2022 takes a long time when there are many files in the directory.

My idea is to pass all the files to GNU parallel (https://www.gnu.org/software/parallel/), which would then run the compilation on as many CPUs as the machine has.

A downside is that I have not yet figured out how to stop if a file cannot be compiled.
But this only causes a problem for the "bad" file; the others continue to be compiled.

I compared the timings using time for 59 tex files:
parallel:

real	6m27.384s
user	16m23.058s
sys	5m38.090s

sequential:

real	13m10.128s
user	10m54.348s
sys	1m53.366s

which means that the parallel run took roughly half the wall-clock time (6m27s versus 13m10s).

If you think that would be a useful enhancement, I could create a pull request for it.


sanjayankur31 · 2022-11-29T11:54:59Z

Thanks for this @felixschurk . Yes, it would certainly be an enhancement. Please feel free to open a PR and we can refine it before the next release.

I've thought of this before, but as you note, it can be tricky to ensure that it is done correctly. I think we'll have to do something along the lines of:

- if parallel exists, use parallel compilation, otherwise fall back to single operations
- parallel returns a non-zero exit status if any of its tasks fail, so we can use that to check whether any task failed, but I haven't thought of how we'd figure out which particular task had failed

Finally, if we can generalise this logic as a separate function, it could perhaps also be used for other bits in the script that process multiple files---like the encryption/decryption bits.

What do you think?
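
A minimal sketch of the fallback logic described above, not taken from calliope itself. The function name run_on_files is hypothetical, and the sketch assumes the command is a single executable name that takes one file as its argument. It checks specifically for GNU parallel, because other tools also install a binary called parallel.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of calliope): run one command per file,
# in parallel when GNU parallel is installed, sequentially otherwise.
# Usage: run_on_files <command> <file>...
run_on_files() {
    local cmd="$1"
    shift
    if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
        # GNU parallel exits non-zero when at least one job failed.
        parallel "$cmd" {} ::: "$@"
    else
        local failed=0 f
        for f in "$@"; do
            # Keep going after a failure, but remember that it happened.
            "$cmd" "$f" || failed=1
        done
        return "$failed"
    fi
}
```

Because the function only reports whether something failed, the same wrapper could in principle be reused for the encryption and decryption steps, with a different command passed in.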

MarkLeakos · 2023-01-23T15:53:21Z

From chatgpt:

There are a few ways to determine which particular task failed when using GNU parallel. One way is to run
parallel --joblog

which creates a log file that records the exit status and command of each task run. By looking at the log file, you can determine which command failed and its corresponding exit status.

You can also run:
parallel --tag
which adds the command's arguments as a prefix to the output. This allows you to easily identify the output of each task, and if a task fails, you can identify which command failed by looking at the output.

You can also use the
parallel --bar
option to give an overview of the progress of the commands, and this also indicates which command failed with a red X.

You can also use the
parallel --halt
option with a value, for example --halt 1 that stops parallel execution if any of the commands exit with a non-zero exit status.

If you are using the parallel command inside a shell script, you can use the
PIPESTATUS
variable to check the exit status of each command.
end of chatgpt.

I have no clue whether any of this would be helpful but I thought I'd put it here for you to decide.
Good luck.

sanjayankur31 · 2023-01-23T16:11:05Z

Thanks @MarkLeakos: unfortunately, ChatGPT is not known for its accuracy, so I'd rather not depend on what it says when it comes to things like this (especially if it does not provide references). man parallel seems quite exhaustive, so I'd expect the answer to be in there.

PS: #10 is on my list of things to do, I just have to find the time to work on it :)

felixschurk · 2023-01-24T08:55:03Z

Thank you @MarkLeakos, I did not know exactly what I was searching for before :D But now with --joblog there is a proper record of what parallel did.

The current PR #10 now produces an output which can be checked for the failed documents.
I thought it is more desirable that parallel continues to work on all files and does not stop when one gives an error, since the documents should usually be independent.

I also added --bar, since it gives some overview when there are many files to process.
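
A rough sketch of how the job log could be checked for failed documents. This is not the code from PR #10; the function name failed_jobs is hypothetical. It relies on the tab-separated joblog layout described in man parallel, where column 7 is Exitval, column 8 is Signal and column 9 is Command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from PR #10): print the command of every job
# in a GNU parallel joblog that exited non-zero or was killed by a signal.
# Usage: failed_jobs <joblog-file>
failed_jobs() {
    # Line 1 is the header. Fields are tab-separated:
    # Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
    awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $9 }' "$1"
}
```

It could be used after a run such as parallel --joblog compile.log --bar pdflatex {} ::: *.tex followed by failed_jobs compile.log, where pdflatex and compile.log are placeholders for whatever compile command and log path the script actually uses.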

MarkLeakos · 2023-01-24T22:49:08Z

You are welcome @felixschurk. ChatGPT is usually good for stimulating ideas. Mark


sanjayankur31 added the enhancement label Nov 29, 2022

felixschurk mentioned this issue Dec 6, 2022

parallel compilation for documents #10

Merged
