Sometimes you'll be starting with a ready-made corpus of texts you want to analyze, but in other cases, you're going to have to spend some time tracking down the texts you want to use.
When you look for texts, your first choice should be pre-digitized texts that someone has spell-checked or retyped by hand. Your second choice is texts that have already been OCRed. Your third choice is texts that still require OCR. If your texts have not been digitized at all, digitizing them (scanning or photographing the pages) is an additional step, and not one that we'll cover here.