In corpus linguistics, a corpus is a body of text. This text can be written, spoken, or a combination of both. Corpora (the plural of corpus) can consist of brief texts on a narrow topic or can run into millions of words, such as the BNC (British National Corpus, a 100-million-word collection of British English) or the Cobuild Corpus.
To access or make use of a corpus, one uses a concordancer to look at linguistic patterns. A concordancer is software that shows instances of a word in a body of text. In addition, it can show collocations and word frequencies. This way of displaying a word together with its surrounding text is called Key Word in Context (KWIC). Web-based concordancers, such as Cobuild and Lextutor, are increasingly available.
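To make the idea concrete, here is a minimal sketch of what a concordancer does behind the scenes: it finds every instance of a word, shows it with a few words of context on each side, and counts word frequencies. It is only an illustration, not how Cobuild or Lextutor actually work. The function names `tokenize` and `kwic`, the `width` parameter, and the sample sentences are all my own assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, keyword, width=4):
    """Return one (left, keyword, right) triple per occurrence of keyword."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# A tiny made-up sample; a real corpus would be far larger.
sample = (
    "You must finish the essay today. She must have left early. "
    "We must not forget the meeting. They have to finish the report."
)

tokens = tokenize(sample)

for left, key, right in kwic(tokens, "must"):
    # Right-align the left context so the keyword lines up in one column.
    print(f"{left:>30}  {key}  {right}")

# Word frequencies across the whole sample.
frequencies = Counter(tokens)
print(frequencies.most_common(3))
```

Lining the keyword up in a single column is what lets the eye scan down the page and notice what tends to come before and after the word.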
The following screenshots are from the Cobuild web-based corpus:
1. Write the word you want to query in the box, then click “show conc”.
2. A pop-up window will appear with the instances of the word.
3. To use this in the classroom, the teacher should direct students’ attention to the collocations and usage of the word in authentic language (the instances are taken from authentic text, not text written specifically for ESL/EFL). Students can then derive grammatical rules (e.g. for modal verbs or indefinite pronouns) and notice how certain vocabulary is used in authentic context. They can also deduce which words a given vocabulary item collocates with.
4. The teacher can also use a fill-in-the-blank activity in which the queried word is omitted. If an internet connection is not available or there are no computers in the classroom, the teacher can distribute the concordance lines as a printout. Note that the teacher should spend time preparing the concordance before presenting it in the classroom to ensure that it targets the intended language use.
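The fill-in-the-blank activity from step 4 can be prepared with a few lines of code. This is a rough sketch under my own assumptions: the function name `make_gap_fill`, the blank marker, and the sample sentences are invented for illustration, and the sentences would normally be copied from your concordancer's output.

```python
import re


def make_gap_fill(sentences, target, blank="_____"):
    """Blank out whole-word occurrences of target (ignoring case).

    Sentences that do not contain the target word are left out of the
    worksheet, and the remaining items are numbered from 1.
    """
    pattern = re.compile(rf"\b{re.escape(target)}\b", flags=re.IGNORECASE)
    gapped = [pattern.sub(blank, s) for s in sentences if pattern.search(s)]
    return [f"{n}. {s}" for n, s in enumerate(gapped, start=1)]


# Made-up example lines standing in for real concordance output.
concordance_lines = [
    "Somebody left an umbrella in the hall.",
    "I think somebody is knocking at the door.",
    "Nobody answered the phone.",
]

for item in make_gap_fill(concordance_lines, "somebody"):
    print(item)
```

The printed items can be pasted into a document and handed out as a printout when there are no computers in the classroom.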

Using corpora in the classroom involves using concordance software (or a web-based concordancer such as the one in the example above) to analyze a corpus and spot patterns and differences in language usage. For instance, students can use a corpus with the aid of a concordancer to correct errors in their writing, or the teacher can show students a certain syntactic or lexical usage and have them induce the rule (inductive learning). This is called data-driven learning, since it is based on data analysis that results in linguistic learning (check out the website of the father of data-driven learning, Tim Johns). For more on the idea of data-driven learning click here. Of course, concordancers and corpora are not easy for students to handle, so it is imperative that students practice extensively at deriving or inducing rules from linguistic patterns, or even at correcting their linguistic and writing errors based on a written corpus.
Again, it is important to note that data-driven learning demands extensive practice before it can be employed as an approach. The role of the teacher becomes that of a manager, guide, and observer, and the role of the student changes to that of a language researcher.
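As a small illustration of spotting a pattern in data, the sketch below counts which words come right after a chosen word, so that learners can see for themselves that, say, "interested" is usually followed by "in". The function name `right_collocates`, the `span` parameter, and the sample text are my own assumptions, and a real concordancer offers far richer collocation tools.

```python
import re
from collections import Counter


def right_collocates(text, keyword, span=1):
    """Count the words that occur within span tokens after keyword."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            # Slicing past the end of the list is safe in Python.
            counts.update(tokens[i + 1:i + 1 + span])
    return counts


# Made-up sentences; in class these would come from an authentic corpus.
text = (
    "She is interested in music. They were interested in the results. "
    "He seemed interested to hear the news. I am interested in learning Spanish."
)

for word, count in right_collocates(text, "interested").most_common():
    print(word, count)
```

Students can compare the counts and propose a rule themselves, which is the kind of induction that data-driven learning aims for.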
Why use this approach instead of traditional grammar and lexical instruction?
  • It exposes the language learner to authentic language rather than fabricated ESL text
  • It changes the role of the language learner from a merely receptive individual into a language researcher (note that this approach might not work as expected, especially with young learners)
  • It ensures a learner-centered classroom without diminishing the role of the teacher
  • It encourages learner autonomy with regard to error correction (to be discussed in my next post)
Future posts will say more about concordancing, data-driven learning, and corpora: how a teacher can compile a corpus for a particular learning context, how a teacher can analyze his/her learners’ linguistic output, such as writing (known as learner error analysis), and how to use corpora in more classroom activities.
Now, I leave you with some links to concordance software, including web-based tools, that you can use and play around with:
  • AntConc: Laurence Anthony’s free concordance software that you can download
  • MonoConc Pro: commercial concordance software
  • Concordance: commercial concordance software that I use
  • Lextutor: a free web-based concordancer
  • Cobuild: a free web-based concordancer and corpus
The next post will discuss how to integrate a concordancer into a word processor to promote error noticing and learner autonomy.
If you have any queries, need more info, or just want to give feedback, please post a comment. Your comments are highly welcome :)