In most cases, you're going to need to do some work on your texts, whether that's running scanned pages through Optical Character Recognition (OCR) software or fixing up common OCR errors.
As you work, be sure to save multiple versions of your texts--including the original version! And keep careful records of your texts and where you found them, along with any metadata, in a separate spreadsheet.
Why? This search saves you the effort of searching individually for every number: 0, 1, 2, 3... 326, 327, etc. Page numbers appear in almost all texts, but tell us little about the text.
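To see what a search like this does, here is a minimal sketch in Python rather than in a text editor's find-and-replace box. The pattern \d+ (one or more digits in a row) is an assumption about what such a search looks like, and remove_numbers is a hypothetical helper name used only for illustration. Note that it deletes every number it finds, not only page numbers, so years and quantities in the text disappear too.

```python
import re

def remove_numbers(text: str) -> str:
    """Delete every run of digits from the text (hypothetical helper)."""
    # \d+ matches one or more digits in a row, so "326" is removed in
    # a single match instead of three separate ones.
    return re.sub(r"\d+", "", text)

# Illustrative sample: a line of text with page numbers on either side.
sample = "326 Call me Ishmael. 327"
print(repr(remove_numbers(sample)))
```

The spaces around the deleted numbers are left behind, so you may want a second pass to tidy up extra whitespace. As always, run this on a copy and keep the original version of the text.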
Remove all the uppercase words
Why? Many texts (particularly from Internet Archive) include the title of the book and/or of the chapter on EVERY SINGLE PAGE. This repetition can skew your results; fortunately, these titles are often in all-caps, so you can eliminate them with regular expressions.
This particular regular expression deletes all-caps words of two letters or more. (It will leave I and A. These are common stopwords that you may remove later anyway, but keeping them is useful if you're interested in, for example, first-person use in your texts.)
Don't forget to check "match case"! If you leave the box unchecked, the search ignores capitalization, so it will try to delete every single word of two letters or more from your corpus (erasing your data and crashing the program).
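The effect of "match case" can be illustrated with a short Python sketch. This is not the exact search from a text editor: the pattern \b[A-Z]{2,}\b (a whole word made of two or more capital letters) is an assumed equivalent, and remove_allcaps and its match_case parameter are hypothetical names. Python regular expressions are case-sensitive by default, which corresponds to checking the box; passing re.IGNORECASE corresponds to leaving it unchecked.

```python
import re

# Assumed pattern: a whole word made of two or more capital letters.
ALLCAPS_WORD = r"\b[A-Z]{2,}\b"

def remove_allcaps(text: str, match_case: bool = True) -> str:
    """Delete all-caps words of two letters or more (hypothetical helper)."""
    # With match_case=False the pattern also matches lowercase letters,
    # so it hits every word of two letters or more.
    flags = 0 if match_case else re.IGNORECASE
    return re.sub(ALLCAPS_WORD, "", text, flags=flags)

# Illustrative sample: a running title followed by ordinary text.
sample = "MOBY DICK Call me Ishmael. I am A sailor."

# "Match case" on: only the running title is removed.
print(repr(remove_allcaps(sample, match_case=True)))

# "Match case" off: nearly everything is removed.
print(repr(remove_allcaps(sample, match_case=False)))
```

In the first call only MOBY and DICK are deleted, while I and A survive because they are a single letter long. In the second call only I, A, and the punctuation are left, which is the data loss the warning above describes.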