At their most basic level, most textual analysis methods work by counting and classifying words or phrases within texts. Usually, this involves comparing these counts or classifications across a few or many documents. Some of these tools and methods are more advanced than others, but even the most complicated methods often start by counting words.
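As a rough illustration of what "counting words and comparing across documents" can look like in practice, here is a minimal Python sketch. The two sample documents, the `count_words` helper, and the simple tokenizing rule (lowercase letters with an optional internal apostrophe) are assumptions made up for this example; real projects usually rely on a dedicated tool or library.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercase word in `text` to its frequency."""
    # Hypothetical tokenizing rule: runs of letters, optionally with one
    # internal apostrophe (so "don't" stays a single word).
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens)


# Two invented sample documents, used only for illustration.
documents = {
    "doc_a": "The whale surfaced. The crew watched the whale.",
    "doc_b": "The garden was quiet, and the roses were in bloom.",
}

# Count the words in each document separately.
counts = {name: count_words(text) for name, text in documents.items()}

# Compare how often a few chosen words appear in each document.
for word in ("the", "whale", "roses"):
    print(word, {name: c[word] for name, c in counts.items()})
```

A `Counter` returns 0 for a word it has never seen, which makes it easy to compare the same word across documents even when it is missing from some of them.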
A few tips:
Why use computers to assist you with text analysis?
You are doing textual analysis any time you look at a text and pull out a deeper meaning. Computer-assisted textual analysis just means that you use some type of computing to help.
We use computers to help with text analysis because they can help us spot trends or themes that we might miss otherwise.
Computers can help you:
Tip: If you’ve got a question that can be answered by a small number of texts that you can realistically and easily read, it might not be the best question for computer-assisted textual analysis. Many computer-assisted textual analysis methods require a fairly large corpus in order to produce results that can reliably be interpreted. Results produced from smaller collections of texts, even if they look meaningful, are likely to be misleading.
Tip: Textual analysis software is changing very quickly. There is probably a new tool out there today that wasn’t there a few months or years ago. There are several internet-based tools that let you perform many of the analyses mentioned above, and more. Always check when the tools and techniques you use were published or shared, and look for newer versions. Anything more than a few years old is likely to be out of date compared with recent advances.
Text data that you gather from databases or websites often needs to be prepared in specific ways before it can be analyzed. This isn't the most glamorous part of digital scholarship, but it is a crucial first step. There is no one correct way to prepare (often termed 'pre-process') your text data either, although some preparation steps are so common that they are almost ubiquitous, while others are only appropriate in certain specific circumstances.
Why would you need to prepare a corpus? Most textual analysis projects rely on plain-text digital versions of documents, which are easily read and analyzed by a computer. For the results to be accurate, these documents need to be relatively free of spelling errors, clearly labeled, and collected in one place. Computers are also very sensitive to capitalization, punctuation, and other kinds of formatting: to a computer, "Whale", "whale", and "whale!" are three different strings. These differences usually (but not always) need to be removed or standardized before computers can help read texts.
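To make the point about capitalization and punctuation concrete, here is a small Python sketch of two very common preparation steps: lowercasing and stripping punctuation. The `preprocess` function and its two switches are hypothetical names chosen for this example, not part of any particular tool. Both steps are optional because, as noted above, they are not appropriate for every project.

```python
import string


def preprocess(text, lowercase=True, strip_punctuation=True):
    """Return a list of word tokens after optional cleaning steps."""
    if lowercase:
        text = text.lower()
    if strip_punctuation:
        # string.punctuation covers ASCII punctuation only; curly quotes
        # and other typographic marks would need separate handling.
        text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on any run of whitespace.
    return text.split()


raw = "The Whale! the whale?  THE WHALE..."

# With cleaning, the three spellings collapse into the same two words.
print(preprocess(raw))

# Without cleaning, every variant is treated as a different token.
print(preprocess(raw, lowercase=False, strip_punctuation=False))
```

Whether to apply each step depends on the research question. A project studying emphasis or sentence structure, for example, might deliberately keep capitalization or punctuation.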
Use the tabs at the top left of this page to explore the most common steps of preparing a digital corpus: finding, cleaning, and analyzing.