Thanks for the guidelines. I am still working on a word list, but there is still a lot of "waste" in the corpus, and cleaning it out takes time. I started the list to provide a basis for a spellchecker. I have a relatively clean file, but it is very incomplete and biased almost exclusively toward one dialect. Still, it could be good for a start. Anyway, I am not in a hurry, so things will get done progressively.