The SULEC Corpus is a project managed by a group of
researchers from the Department of English Philology at the
The subsequent, all-embracing analysis of such data will allow us to perform investigations at different levels:
Phonological level: main difficulties these students encounter when learning pronunciation (segmental and suprasegmental features), linguistic interference, preferences for a specific model or linguistic variety.
Morphosyntactic level: word order, concord problems, length and syntactic structures, acquisition of given constructions (negative forms, relative clauses, existential constructions), empty categories.
Lexical level: type and number of words used, frequencies of use, lexical collocations, "false friends".
Discourse level: organisation of the information, use of cohesive devices, communicative strategies.
In addition, we will explore the pedagogical applications derived from our corpus, incorporating this information into the materials used for English language teaching (dictionaries, glossaries, grammars and other reference books). Likewise, we believe that the results of our analysis might have important implications for the fields of Translation and so-called Computer Assisted Language Learning (CALL).
The aim of the project is the compilation of a large and solid corpus of real language, both spoken and written, produced by Spanish learners of English. At present, no corpus with all these features exists, and ours would open the way for a great number of subsequent studies in the various areas related to the acquisition and teaching of English, such as Translation and Contrastive Linguistics.
Although some influential linguists, such as Chomsky, have been sceptical of research based on corpora, corpus-based research has been used with great success in the study of English, leading to the creation of corpora such as the British National Corpus (BNC) or the International Corpus of English (ICE). This has had a great influence on the creation of corpora to study second language performance, and researchers have therefore put together data collections such as the International Corpus of Learner English (ICLE), the Taiwanese Learner Corpus of English (TLCE) or the Japanese EFL Learner Corpus (JEFLL). We believe in the importance of basing our research on a corpus. By looking at real second language performance, we do not base our research on theories and hypotheses alone. We therefore expect that this project will contain interesting data showing the performance of Spanish speakers of English, and that it will be successfully applied to many different research purposes.