mantis2trac.py dies on Unicode errors #4

Open
Aeon opened this issue Mar 28, 2017 · 3 comments

Comments

@Aeon
Member

Aeon commented Mar 28, 2017

On 4-Aug-2009, at 3:23pm, tdmurphy4 wrote (Trac issue 5617):

I am trying to run mantis2trac.py on a Mantis v1.1.8 server running under MySQL 5.x. It imports 40 tickets, then on ticket 41 it bombs out with:

Traceback (most recent call last):
  File "./mantis2trac.py", line 943, in <module>
    main()
  File "./mantis2trac.py", line 940, in main
    convert(MANTIS_DB, MANTIS_HOST, MANTIS_USER, MANTIS_PASSWORD, TRAC_ENV, TRAC_CLEAN)
  File "./mantis2trac.py", line 680, in convert
    trac.addTicket(**ticket)
  File "./mantis2trac.py", line 386, in addTicket
    desc = description.encode('utf-8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa3 in position 244: unexpected code byte

I looked at ticket 41. It has the UK pound symbol (£) in it. How do I get the script to fix this properly? I'd like to import all of the tickets in but this is a really big showstopper.

Thanks!
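
For readers following along, a minimal sketch of why that byte is a problem, assuming the description reached the script as Latin-1 bytes (the thread does not confirm this). The byte 0xa3 from the traceback is how Latin-1 stores £, and on its own it is not valid UTF-8. The sketch is written for Python 3; the variable names are illustrative only and not taken from mantis2trac.py.

```python
# Assumption: the raw byte below stands in for the character at position 244
# of ticket 41's description, as reported in the traceback.
raw = b"\xa3"

# Interpreted as Latin-1, the byte is the pound sign.
as_latin1 = raw.decode("latin-1")

# Interpreted as UTF-8, the same byte cannot be decoded at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Once the text is decoded with the right codec, encoding to UTF-8 works
# and produces the two-byte sequence UTF-8 uses for the pound sign.
as_utf8_bytes = as_latin1.encode("utf-8")
```

In Python 2, calling `.encode()` on a byte string first decodes it implicitly, which is why a line that only appears to encode can raise a `UnicodeDecodeError`.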

@Aeon Aeon self-assigned this Mar 28, 2017
@Aeon
Member Author

Aeon commented Mar 28, 2017

On 4-Aug-2009, at 4:47pm, tdmurphy4 wrote (Trac issue 5617#comment:1):

I tried converting the Mantis database from latin1 to utf8 with the instructions given at http://en.gentoo-wiki.com/wiki/TIP_Convert_latin1_to_UTF-8_in_MySQL but it makes no difference. The python script completely barfs when it hits a '£'. Is there a way around this?

@Aeon
Member Author

Aeon commented Mar 28, 2017

On 5-Aug-2009, at 10:12am, tdmurphy4 wrote (Trac issue 5617#comment:2):

I managed to fix it. You might want to add a note to the mantis2trac.py page: if MySQL has the database in latin1 format and Mantis merrily encodes things as UTF-8 anyway, dumping and re-importing the database per the instructions at http://paulkortman.com/2009/07/24/mysql-latin1-to-utf8-conversion/ fixes things.
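
The mismatch described here (UTF-8 bytes sitting in storage labelled latin1) can be illustrated with a short Python 3 sketch. It only demonstrates the general effect on a single character and makes no claim about what Mantis or MySQL do internally; the variable names are made up for the illustration.

```python
original = "\u00a3"  # the pound sign

# The application writes UTF-8 bytes...
stored_bytes = original.encode("utf-8")

# ...but a reader that trusts the latin1 label sees two characters instead of one.
misread = stored_bytes.decode("latin-1")

# Reversing the wrong step recovers the original text, as long as
# nothing else has altered the bytes in between.
repaired = misread.encode("latin-1").decode("utf-8")
```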

I basically dumped with --default-character-set=utf8, then ran sed on the dump to change occurrences of utf8 to latin1 (sed -i 's/utf8/latin1/g' dump.sql), then imported it into a new, empty database I created and ran the Mantis import script on that.
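
The sed step can be sketched in Python as follows. The first function mirrors the blanket replacement from the comment, in the same direction (utf8 to latin1). The second is a narrower variant, not something the commenter used, that touches only character-set declarations so ticket text mentioning "utf8" is left alone. Both function names, the regular expression, and the sample dump are invented for this sketch; a real dump may declare character sets in other places as well.

```python
import re


def relabel_everywhere(dump_sql, old="utf8", new="latin1"):
    """Equivalent of sed 's/utf8/latin1/g': replaces every occurrence, data included."""
    return dump_sql.replace(old, new)


# Hypothetical pattern: a charset keyword followed by the charset name.
_CHARSET_DECL = re.compile(r"(SET NAMES |CHARSET=|CHARACTER SET )(\w+)")


def relabel_declarations(dump_sql, old="utf8", new="latin1"):
    """Replace the charset name only where it follows a charset keyword."""
    def swap(match):
        keyword, charset = match.group(1), match.group(2)
        return keyword + (new if charset == old else charset)
    return _CHARSET_DECL.sub(swap, dump_sql)


# Invented sample, not taken from a real Mantis dump.
sample = (
    "/*!40101 SET NAMES utf8 */;\n"
    "CREATE TABLE t (note TEXT) ENGINE=MyISAM DEFAULT CHARSET=utf8;\n"
    "INSERT INTO t VALUES ('remember to check utf8 handling');\n"
)

blanket = relabel_everywhere(sample)
narrow = relabel_declarations(sample)
```

With the blanket version the INSERT line's text is rewritten too, which is harmless for most trackers but worth knowing about before importing.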

@Aeon
Member Author

Aeon commented Mar 28, 2017

On 3-May-2010, at 10:35am, anonymous wrote (Trac issue 5617#comment:3):

I'm experiencing this problem too, with all of the Mantis database's tables using the utf8_general_ci collation. I tried simply removing the calls to .encode('utf-8'), but that results in a different error altogether:

inserting ticket 91 -- "speciale teken &#966; als ? afgedrukt"
Traceback (most recent call last):
  File "c:\python26\scripts\mantis2trac.py", line 946, in <module>
    main()
  File "c:\python26\scripts\mantis2trac.py", line 943, in main
    convert(MANTIS_DB, MANTIS_HOST, MANTIS_USER, MANTIS_PASSWORD, TRAC_ENV, TRAC_CLEAN)
  File "c:\python26\scripts\mantis2trac.py", line 683, in convert
    trac.addTicket(**ticket)
  File "c:\python26\scripts\mantis2trac.py", line 404, in addTicket
    summary, desc, keywords))
  File "build\bdist.win32\egg\trac\db\util.py", line 122, in execute
  File "build\bdist.win32\egg\trac\db\sqlite_backend.py", line 78, in execute
  File "build\bdist.win32\egg\trac\db\sqlite_backend.py", line 56, in execute
  File "build\bdist.win32\egg\trac\db\sqlite_backend.py", line 48, in _rollback_on_error
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

This occurs on special characters like ë and á and ì. Is there any quick fix to be done in the script to fix this? I'm no character encoding expert, and I've already spent far too much time on this problem.
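
The sqlite3 message asks for Unicode strings rather than raw bytes, so one possible direction is to decode every value before it reaches the database. The sketch below is written for Python 3 and is not a patch to mantis2trac.py: `to_text`, the fallback order (UTF-8 first, then Latin-1), and the `ticket` table are all assumptions made for illustration, and the table is not Trac's real schema. Under the Python 2.6 shown in the traceback, the text type would be `unicode` rather than `str`.

```python
import sqlite3


def to_text(value, encodings=("utf-8", "latin-1")):
    """Return value as text, trying each encoding in turn (hypothetical helper)."""
    if isinstance(value, str):
        return value
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reached if a custom encoding list is passed and none of them fit.
    return value.decode(encodings[-1], errors="replace")


conn = sqlite3.connect(":memory:")
# Illustrative table only.
conn.execute("CREATE TABLE ticket (id INTEGER PRIMARY KEY, summary TEXT)")

rows = [
    (41, b"Cost is \xa310"),                        # Latin-1 bytes
    (91, "speciale teken \u03c6".encode("utf-8")),  # UTF-8 bytes
]
for ticket_id, raw_summary in rows:
    conn.execute(
        "INSERT INTO ticket (id, summary) VALUES (?, ?)",
        (ticket_id, to_text(raw_summary)),
    )

stored = [row[0] for row in conn.execute("SELECT summary FROM ticket ORDER BY id")]
```

Trying UTF-8 before Latin-1 matters because Latin-1 accepts any byte sequence, so it would never fail and would silently turn valid UTF-8 into garbled text if tried first.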

@Aeon Aeon removed their assignment Mar 28, 2017