Clean up bits and bobs #64

lizgzil · 2022-08-19T15:43:04Z

Notes from #52 (not addressed)

make sure the scripts run with each loading option: esco embeddings loaded or not, and ojo-esco pre-mapped loaded or not.
maybe make the displacy NER viewer available from ExtractSkills
add multiskill flag to ExtractSkills output (i.e. if the skill originally came from a multiskill and was split or not). This was harder than I thought when I began to address it, so I gave up for now
for ExtractSkills - maybe some logger warnings if the input data isn't in the correct format, e.g. if you input a job advert which is a float or something weird like that
perhaps make the toy example a bit nicer/cleaner, e.g. include a maths skill so that the output looks good (rather than matching maths with communication skills because it's the closest available).
in skill_ner_mapper: something clever with which bert model is loaded, e.g. if you load esco embeddings, then the ojo embeddings should be found using the same model
in skill_ner_mapper: I think which variables are assigned to self, and which are returned by the functions, might need cleaning up. I think we might be using self. variables ineffectively
update what skill_ner_mapper does in a readme
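
For the logger-warnings note above, a rough sketch of what the input check could look like, assuming job adverts are passed in as a list that should contain strings. `keep_valid_job_adverts` is a hypothetical helper made up for illustration, not something that exists in ExtractSkills, and skipping bad items (rather than raising) is just one possible choice.

```python
import logging

logger = logging.getLogger(__name__)


def keep_valid_job_adverts(job_adverts):
    """Return only the non-empty string adverts, warning about every item dropped.

    Hypothetical helper: the name and the skip-on-error behaviour are assumptions.
    """
    valid = []
    for i, advert in enumerate(job_adverts):
        if not isinstance(advert, str):
            # e.g. a float or None left over from a dataframe column
            logger.warning(
                "Job advert %d is a %s, not a string; skipping it",
                i,
                type(advert).__name__,
            )
        elif not advert.strip():
            logger.warning("Job advert %d is empty; skipping it", i)
        else:
            valid.append(advert)
    return valid
```

Something like this could run on the input before any skills are predicted, so a float or an empty advert shows up as a warning instead of failing somewhere deeper in the pipeline.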

lizgzil added the Cleaning/refactor label Aug 19, 2022
