Using corpus analysis software to analyze specialized texts
1. What is a corpus?
A corpus (plural: ‘corpora’) can be generally defined as ‘a collection of naturally occurring texts in a computer-readable format which can be retrieved and analyzed using corpus analysis software’ (Kennedy, 1998; McEnery & Wilson, 2001; O’Keeffe, McCarthy, & Carter, 2007; Teubert & Cermakova, 2007).
2. Sources of language corpora
· Subscribe to a large corpus provider such as the British National Corpus (BNC).
· Use web concordancing.
· Compile your own corpora and analyze the data using corpus analysis software:
§ AntConc (for monolingual corpora)
§ WordSmith (for monolingual corpora)
§ ParaConc (for multilingual corpora)
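To give a feel for what a concordancer does, here is a minimal keyword-in-context (KWIC) sketch in Python. It is not how AntConc, WordSmith, or ParaConc are implemented; the tokenizing rule, the function names `tokenize` and `kwic`, and the sample sentences are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())

def kwic(tokens, node, width=4):
    """Return (left context, node word, right context) for every hit of `node`."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Invented sample text, used only to demonstrate the output format.
sample = ("South Korea held talks on trade. "
          "Officials in South Korea said the talks would continue.")

concordance = kwic(tokenize(sample), "korea", width=3)
for left, node, right in concordance:
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>25} [{node}] {right}")
```

Aligning the node word in a central column is what makes recurring patterns to its left and right easy to spot by eye.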
3. Designing a specialized corpus (based on Bowker and Pearson 2002)
· Corpus size
§ There are no fixed rules; size depends on the research purpose and on the availability of data and time.
· Text extracts vs. full texts
§ Depends on the aim of corpus compilation.
· Number of texts
§ Depends on your research focus.
· Medium
§ Can be spoken texts, written texts, or a mix of both, depending on the research questions.
· Subject and text type
§ Should mainly focus on the specialized texts under investigation.
§ Text types within a specialized subject field may vary from technical to popular texts.
· Other considerations
§ Authorship: Texts written by experts in a field tend to be more reliable.
§ Language: Specialized texts can be stored and retrieved in the form of monolingual, comparable, or parallel corpora.
§ Publication date: Texts should come from recent publications unless the queries relate to a particular period of time.
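Two of the criteria above, corpus size and number of texts, are easy to check automatically once the texts are collected. The following Python sketch shows one possible way to do this. The function name `profile`, the tokenizing rule, and the two sample texts are hypothetical; in practice the dictionary would be filled by reading your own text files.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())

def profile(texts):
    """Summarize a corpus given as a dict of {text name: text content}."""
    token_lists = [tokenize(content) for content in texts.values()]
    total = sum(len(tokens) for tokens in token_lists)
    distinct = {tok for tokens in token_lists for tok in tokens}
    return {
        "number_of_texts": len(texts),
        "size_in_words": total,
        "distinct_word_forms": len(distinct),
        "mean_words_per_text": total / len(texts) if texts else 0.0,
    }

# Invented mini-corpus, used only to show the shape of the result.
corpus = {
    "text_01": "Talks on trade resumed in Seoul.",
    "text_02": "Trade talks will continue next week.",
}

summary = profile(corpus)
print(summary)
```

Word counts depend on how tokens are defined, so figures from this sketch may differ slightly from those reported by a particular corpus tool.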
4. Sources of specialized texts
· Printed materials software
· Word document texts
· CD-ROMs
· Texts on the web
· Online databases
5. Getting started with AntConc
· Download the latest version of AntConc.
· Creating a specialized corpus profile (adapted from Bowker and Pearson 2002:72)
A sample profile
Size | 56,812 words
Source of corpus data | From the internet (www.voanews.com)
Number of texts | 70 texts
Medium | Written
Subject | News about South Korea
Text types | News articles
Authorship | Journalists
Language | Texts written in English, mostly by native speakers
Publication | Recent texts (retrieved in September 2017)
· Doing small-scale research on your own specialized corpora.
Using corpora to do research in ESP
§ To identify frequent words or clusters in a specialized corpus.
§ To identify key words in a specialized corpus in comparison with a general corpus for syllabus design, materials development, or terminological studies.
§ To examine the patterning and phraseology of words in specialized texts.
§ To examine the meanings of specialized vocabulary.
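The first two uses above (frequent words and clusters, and key words against a general corpus) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say which keyness statistic to use, so log-likelihood is chosen here as one common option, and the function names and the two tiny token lists are invented for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())

def clusters(tokens, n=2):
    """Count contiguous n-word sequences (clusters, also called n-grams)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def log_likelihood(a, b, c, d):
    """Keyness of a word seen a times in a study corpus of c tokens
    and b times in a reference corpus of d tokens."""
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_study)
    if b > 0:
        total += b * math.log(b / expected_ref)
    return 2 * total

def keywords(study_tokens, reference_tokens):
    """Rank words that are relatively more frequent in the study corpus."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / c > b / d:  # skip words that are not overused in the study corpus
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented data standing in for a specialized corpus and a general corpus.
study_tokens = tokenize("The missile test prompted talks. "
                        "The missile launch raised concern.")
reference_tokens = tokenize("The weather was fine and the talks were short "
                            "and the day was long.")

word_frequencies = Counter(study_tokens)
two_word_clusters = clusters(study_tokens, n=2)
ranked_keywords = keywords(study_tokens, reference_tokens)

print(word_frequencies.most_common(3))
print(two_word_clusters.most_common(3))
print(ranked_keywords[:3])
```

With real data the reference corpus should be much larger than the specialized one; on samples this small the scores are not statistically meaningful and only demonstrate the mechanics.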