jordimas/tmx-to-text

Introduction

tmx-to-text converts TMX files into plain text and displays information about them.

This tool can be used, for example, to:

  • Extract translation memories into text files for spell checking or post-editing
  • Extract corpora into text files for training machine learning models or similar tasks

The following command will extract the Catalan and Italian texts out of the TMX file:

tmx-to-text convert -f ca-it.tmx -s ca -t it
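
For readers unfamiliar with the TMX format, the sketch below shows roughly what extracting two languages from a TMX file involves. It is not the tool's own implementation: the function name `extract_pairs`, the sample document, and the exact, case-insensitive language matching are all assumptions made for illustration. It uses only Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the XML namespace; some older TMX files use a plain "lang" attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A tiny, made-up TMX document used only for this illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="ca" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example" creationtoolversion="1"
          o-tmf="none"/>
  <body>
    <tu>
      <tuv xml:lang="ca"><seg>Bon dia</seg></tuv>
      <tuv xml:lang="it"><seg>Buongiorno</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="ca"><seg>Gràcies</seg></tuv>
      <tuv xml:lang="it"><seg>Grazie</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def extract_pairs(tmx_text, source_lang, target_lang):
    """Return (source, target) text pairs for the two requested languages.

    Hypothetical helper: translation units that lack either language are skipped.
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        texts = {}
        for tuv in tu.findall("tuv"):
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup elements.
                texts[lang.lower()] = "".join(seg.itertext()).strip()
        source = texts.get(source_lang.lower())
        target = texts.get(target_lang.lower())
        if source and target:
            pairs.append((source, target))
    return pairs


pairs = extract_pairs(SAMPLE_TMX, "ca", "it")
for source, target in pairs:
    print(source, "|", target)
```

Writing the first element of each pair to one text file and the second to another, one segment per line, would give line-aligned source and target files.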

Running the application with -h shows the options available for the info and convert commands.

usage: tmx-to-text info [-h] -f TMX_FILE

optional arguments:
  -h, --help   show this help message and exit
  -f TMX_FILE  TMX file to show info


usage: tmx-to-text convert [-h] -f TMX_FILE -s SOURCE_LANG -t TARGET_LANG [-p PREFIX] [-d] [-x] [-a]

optional arguments:
  -h, --help            show this help message and exit
  -f TMX_FILE           TMX file to convert
  -s SOURCE_LANG, --source_lang SOURCE_LANG
                        Source language to export
  -t TARGET_LANG, --target_lang TARGET_LANG
                        Target language to export
  -p PREFIX, --prefix PREFIX
                        Filename prefix used in the generated text files
  -d, --debug           Debug memory and execution time
  -x, --nodup_source    Remove duplicates based on source
  -a, --nodup_target    Remove duplicates based on target
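
The help text does not say how the two duplicate-removal options behave in detail. The sketch below shows one plausible reading, in which a pair is dropped when its source (or target) text has already been seen and the first occurrence is kept. The function name `drop_duplicates` and the sample pairs are assumptions for illustration, not the tool's code.

```python
def drop_duplicates(pairs, key_index):
    """Keep the first pair seen for each distinct value at key_index.

    key_index 0 deduplicates on the source text, 1 on the target text.
    Hypothetical helper; keeping the first occurrence is an assumption.
    """
    seen = set()
    unique = []
    for pair in pairs:
        key = pair[key_index]
        if key in seen:
            continue
        seen.add(key)
        unique.append(pair)
    return unique


# Made-up sample data: one repeated source and one repeated target.
sample_pairs = [
    ("Bon dia", "Buongiorno"),
    ("Bon dia", "Buon giorno"),
    ("Adéu", "Ciao"),
    ("A reveure", "Ciao"),
]

by_source = drop_duplicates(sample_pairs, 0)
by_target = drop_duplicates(sample_pairs, 1)
print(by_source)
print(by_target)
```

Under this reading, deduplicating on the source keeps a single translation per source sentence, while deduplicating on the target keeps a single source per translation.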