Santa Barbara Corpus of Spoken American English

Parts 1-4 of the Santa Barbara Corpus of Spoken American English (SBCSAE) are now available, for a total of approximately 249,000 words. The Santa Barbara Corpus includes transcriptions, audio, and timestamps which correlate transcription and audio at the level of individual intonation units.

Access
Description
Contents and Summaries
Citation
Recordings
Acknowledgements
Contact

Access

All transcriptions in the Santa Barbara Corpus parts 1-4 can be downloaded for free by clicking here. Metadata is available here.

To access individual conversations and other discourse segments in the Santa Barbara Corpus, you may select the audio file and transcription you wish to download by consulting the Contents and Summaries.

To download the audio files in WAV (recommended) or MP3 format, do the following:

  1. Select the transcription you want (e.g. SBC001 Actual Blacksmithing) under the listing of Contents and Summaries
  2. Right-click on the audio format you want (WAV or MP3)
  3. Select "Save link as...". This should save the file to your computer in the format you have selected.

Alternatively, you can do the following:

  1. Select a transcription (e.g. SBC001 Actual Blacksmithing) under the listing of Contents and Summaries
  2. Click on the audio format you want (WAV or MP3)
  3. The sound will start playing on your computer, and you will see a bar on your screen. Wait a little while for the file to download to your computer (during which time you can listen to the streaming audio)
  4. Click on the downward-pointing arrow at the right edge of the bar, and choose "Save as source". This should save the file to your computer in the format you have selected.

Although it is now available for free on-line (see above), the Santa Barbara Corpus of Spoken American English can still be purchased on CD and DVD from the Linguistic Data Consortium, at the following web pages:

Part 1: LDC Catalog No. LDC2000S85
Part 2: LDC Catalog No. LDC2003S06
Part 3: LDC Catalog No. LDC2004S10
Part 4: LDC Catalog No. LDC2005S25

A version of the Santa Barbara Corpus transcriptions in CHAT format, including metadata, is available for download here; CHAT transcriptions of individual conversations are also available here under Contents and Summaries.

     The audio files for the Santa Barbara Corpus can also be downloaded from TalkBank.org, in either MP3 or WAV file format, from the following locations:
     For MP3 files: https://talkbank.org/media/CABank/SBCSAE/
     For WAV files: https://talkbank.org/media/CABank/SBCSAE/0wave/
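
The two TalkBank locations above can be turned into per-segment download links with a few lines of Python. This is only a sketch: the helper names `segment_id` and `audio_url` are made up for illustration, and the file name pattern (`SBC001.wav`, `SBC001.mp3`) is an assumption based on the file names used elsewhere on this page, not something confirmed for the TalkBank server.

```python
from urllib.parse import urljoin

# Base locations as listed above.
MP3_BASE = "https://talkbank.org/media/CABank/SBCSAE/"
WAV_BASE = "https://talkbank.org/media/CABank/SBCSAE/0wave/"


def segment_id(number: int) -> str:
    """Return the segment identifier (e.g. 'SBC001') for segments 1-60."""
    if not 1 <= number <= 60:
        raise ValueError("Parts 1-4 contain segments SBC001 through SBC060")
    return f"SBC{number:03d}"


def audio_url(number: int, fmt: str = "wav") -> str:
    """Build a download URL for one segment in WAV or MP3 format.

    ASSUMPTION: files on the server are named '<segment id>.<format>'.
    """
    bases = {"mp3": MP3_BASE, "wav": WAV_BASE}
    fmt = fmt.lower()
    if fmt not in bases:
        raise ValueError("format must be 'wav' or 'mp3'")
    return urljoin(bases[fmt], f"{segment_id(number)}.{fmt}")


if __name__ == "__main__":
    print(audio_url(1, "wav"))
    print(audio_url(60, "mp3"))
```

If a link built this way does not resolve, browse the directory listing at the base location to find the actual file name.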

SBCSAE by John W. Du Bois is licensed under a Creative Commons Attribution-No Derivative Works 3.0 United States License.

Description

The Santa Barbara Corpus of Spoken American English is based on a large body of recordings of naturally occurring spoken interaction from all over the United States. The Santa Barbara Corpus represents a wide variety of people of different regional origins, ages, occupations, genders, and ethnic and social backgrounds. The predominant form of language use represented is face-to-face conversation, but the corpus also documents many other ways that people use language in their everyday lives: telephone conversations, card games, food preparation, on-the-job talk, classroom lectures, sermons, story-telling, town hall meetings, tour-guide spiels, and more.

The Santa Barbara Corpus was compiled by researchers in the Linguistics Department of the University of California, Santa Barbara. The Director of the Santa Barbara Corpus is John W. Du Bois, working with Associate Editors Wallace L. Chafe and Sandra A. Thompson (all of UC Santa Barbara), and Charles Meyer (UMass, Boston). For the publication of Parts 3 and 4, the authors are John W. Du Bois and Robert Englebretson.

The Santa Barbara Corpus of Spoken American English also forms part of the International Corpus of English (ICE). The Santa Barbara Corpus provides the main source of data for the spontaneous spoken portions of the American component of the International Corpus of English. In order to meet the specific design specifications of the International Corpus of English (allowing comparison between American and other national varieties of English), the Santa Barbara Corpus data have been supplemented by additional materials in certain genres (e.g. read speech), filling out the American component of ICE.

The Research Centre for Empirical Pragmatics (RCEP) at Bonn Applied English Linguistics (BAEL) maintains a bibliography of works which make use of the Santa Barbara Corpus.

Contents and Summaries

Part 1
SBC001 Actual Blacksmithing
SBC002 Lambada
SBC003 Conceptual Pesticides
SBC004 Raging Bureaucracy
SBC005 A Book about Death
SBC006 Cuz
SBC007 A Tree's Life
SBC008 Tell the Jury That
SBC009 Zero Equals Zero
SBC010 Letter of Concern
SBC011 This Retirement Bit
SBC012 American Democracy is Dying
SBC013 Appease the Monster
SBC014 Bank Products

Part 2
SBC015 Deadly Diseases
SBC016 Tape Deck
SBC017 Wonderful Abstract Notions
SBC018 Vet Morning
SBC019 Doesn't Work in this Household
SBC020 God's Love
SBC021 Fear
SBC022 Runway Heading
SBC023 Howard's End
SBC024 Risk
SBC025 The Egg which Luther Hatched
SBC026 Hundred Million Dollars
SBC027 Atoms Hanging Out
SBC028 Hey Cutie Pie
SBC029 Ancient Furnace
SBC030 Vision

Part 3
SBC031 Tastes Very Special
SBC032 Handshakes All Around
SBC033 Guilt
SBC034 What Time is it Now?
SBC035 Hold My Breath
SBC036 Judgmental on People
SBC037 Very Good Tamales
SBC038 Good Strong Dam
SBC039 Pretty Busy Bird
SBC040 Beaten on a Regular Basis
SBC041 X Units of Insulin
SBC042 Stay Out of It
SBC043 Try a Couple Spoonfuls
SBC044 He Knows
SBC045 The Classic Hooker
SBC046 Flumpity-Bump Down the Hill

Part 4
SBC047 On the Lot
SBC048 Mickey Mouse Watch
SBC049 Noise Pollution
SBC050 Just Wanna Hang
SBC051 New Yorkers Anonymous
SBC052 Oh You Need a Breadbox
SBC053 I Will Appeal
SBC054 'That's Good', Said Tiger
SBC055 The Mama of Dada
SBC056 What is a Brand Inspection?
SBC057 Throw Me
SBC058 Swingin' Kid
SBC059 You Baked
SBC060 Shaggy Dog Story

SBC001 Actual Blacksmithing

This is a conversation recorded in rural Hardin, Montana. Mae Lynne is a student of equine science, and is the main speaker. She is telling Lenore (a visitor and near stranger) about her studies. Doris, Mae Lynne's mother, is doing housework, but joins the conversation near the end to discuss friends of their family.

Audio: WAV MP3 Text: TRN CHAT

SBC002 Lambada

After-dinner conversation among four friends in San Francisco, California. Participants are in their late twenties or early thirties. Harold and Jamie are a married couple, Miles is a doctor, and Pete is a graduate student from Southern California.

Audio: WAV MP3 Text: TRN CHAT

SBC003 Conceptual Pesticides

A conversation among three friends who are preparing dinner together, recorded in Southern California. Roy and Marilyn are a married couple, and Pete is a friend visiting from out of town. All participants are in their early thirties.

Audio: WAV MP3 Text: TRN CHAT

SBC004 Raging Bureaucracy

Family conversation recorded in Santa Fe, New Mexico. The primary participants are three sisters all in their twenties.

Audio: WAV MP3 Text: TRN CHAT

SBC005 A Book About Death

A conversation between a couple who are lying in bed, recorded in Santa Barbara, California.

Audio: WAV MP3 Text: TRN CHAT

SBC006 Cuz

A very lively interaction between two female cousins in their mid-thirties, recorded in Los Angeles, California.

Audio: WAV MP3 Text: TRN CHAT

SBC007 A Tree's Life

Late-night conversation between two sisters, recorded in Montana.

Audio: WAV MP3 Text: TRN CHAT

SBC008 Tell the Jury that

Task-related interaction: an attorney preparing two witnesses to testify in a criminal trial. Recorded in San Francisco, California. Rebecca is a lawyer, June and Rickie are the witnesses, and Arnold is Rickie's husband.

Audio: WAV MP3 Text: TRN CHAT

SBC009 Zero Equals Zero

Task-related talk between a teenage couple, recorded in Mobile, Alabama. Kathy is helping her boyfriend Nathan prepare for a math test.

Audio: WAV MP3 Text: TRN CHAT

SBC010 Letter of Concerns

A business conversation recorded in New Mexico. Brad and Phil are board members of a local arts society. Phil wants to talk business, while Brad keeps trying to leave to pick up his wife who's waiting for him at a bookstore.

Audio: WAV MP3 Text: TRN CHAT

SBC011 This Retirement Bit

A conversation among three friends before lunch, recorded in Tucson, Arizona. All three participants are retired women; Samantha (Sam) is 72, Doris is 83, and Angela is 90.

Audio: WAV MP3 Text: TRN CHAT

SBC012 American Democracy is Dying

University lecture, recorded in Riverside, California. This is a Chicano Studies class; the professor is the primary participant, although nine members of this small summer school class occasionally interact.

Audio: WAV MP3 Text: TRN CHAT

SBC013 Appease the Monster

This is a family conversation/birthday party, recorded in Fort Wayne, Indiana. The five participants are family members: Kendra (the birthday girl) and Kevin are siblings, Ken and Marci are their parents, and Wendy is Kevin's wife. This segment is highly interactional and contains a lot of overlap.

Audio: WAV MP3 Text: TRN CHAT

SBC014 Bank Products

Task-related talk: a loan officers' meeting, recorded in a bank in a small town in rural southern Illinois. Joe and Fred are loan officers working for the bank. Jim is the president of the bank, and Kurt is a board member.

Audio: WAV MP3 Text: TRN CHAT

SBC015 Deadly Diseases

A conversation among three friends, recorded in Los Angeles, California. Ken and Joanne are a couple, and Lenore is a friend of theirs.

Audio: WAV MP3 Text: TRN CHAT

SBC016 Tape Deck

A sales encounter, recorded in an audio store in Santa Barbara. Tammy is planning to buy a new tape deck. Brad, a salesman at the audio store, is discussing various tape decks which he is trying to sell her.

Audio: WAV MP3 Text: TRN CHAT

SBC017 Wonderful Abstract Notions

A conversation between two male friends, recorded in Southern California.

Audio: WAV MP3 Text: TRN CHAT

SBC018 Vet Morning

A task-related interaction recorded in a veterinarian office near Madison, Wisconsin. All five participants work in the office, some as secretaries and assistants and some as veterinarians.

Audio: WAV MP3 Text: TRN CHAT

SBC019 Doesn't Work in this Household

A family conversation, recorded in Michigan. Frank and Jan (a married couple) are talking with Ron, Jan's brother, who is visiting from California. Brett and Melissa are Frank and Jan's junior-high-age children, who are doing homework and also taking part in the conversation.

Audio: WAV MP3 Text: TRN CHAT

SBC020 God's Love

A segment from a sermon/lecture recorded at a small conference near Chicago, Illinois. The speaker is a pastor in his mid seventies.

Audio: WAV MP3 Text: TRN CHAT

SBC021 Fear

A segment from a rather lively sermon recorded in Boston, Massachusetts.

Audio: WAV MP3 Text: TRN CHAT

SBC022 Runway Heading

Task-related interaction, recorded in an air traffic control tower in Portland, Oregon. Lance is training to be an air traffic controller, and has just finished working a shift. Randy, an experienced controller, is giving Lance feedback/briefing on his performance on that shift.

Audio: WAV MP3 Text: TRN CHAT

SBC023 Howard's End

A segment from a book discussion group, recorded in Topeka, Kansas. The eleven participants are all women between the ages of 46 and 85.

Audio: WAV MP3 Text: TRN CHAT

SBC024 Risk

This segment consists of game-playing and game-teaching on a computer, and was recorded near Cape Cod, Massachusetts. Jennifer and Dan are a couple in their early twenties.

Audio: WAV MP3 Text: TRN CHAT

SBC025 The Egg which Luther Hatched

This is a segment from a lecture on the history and theology of Martin Luther, part of an evening class held at a church, recorded in Delaware.

Audio: WAV MP3 Text: TRN CHAT

SBC026 Hundred Million Dollars

This is a city meeting, recorded in Chicago, Illinois. City officials interact with the public about a government grant which is being applied for, to fund community development. The city can only apply once, so officials are soliciting applications from various organizations and will submit the one they judge to be best.

Audio: WAV MP3 Text: TRN CHAT

SBC027 Atoms Hanging Out

An entertaining science lecture and demonstration, recorded at a large public science museum in Chicago, Illinois.

Audio: WAV MP3 Text: TRN CHAT

SBC028 Hey Cutie Pie

A very intimate long-distance telephone conversation between a romantic couple in their early twenties, which took place between Pennsylvania and California.

Audio: WAV MP3 Text: TRN CHAT

SBC029 Ancient Furnace

This is a business conversation recorded in Northern California between Seth and Larry, who are meeting for the first time. Seth works as an engineer who designs, installs, and sells heating and air conditioning units. Larry has invited him to his home to give him an estimate.

Audio: WAV MP3 Text: TRN CHAT

SBC030 Vision

A segment from a sermon, recorded at a large Baptist church in Chicago, Illinois.

Audio: WAV MP3 Text: TRN CHAT

SBC031 Tastes Very Special

Face-to-face conversation recorded in a restaurant in Pullman, Washington. Sherry and Beth are sisters (in their late twenties), and Rosemary is their mother. The participants discuss what to order for lunch, interact with the waitress (Jamie) and engage in talk about family and friends while waiting for their food.

Audio: WAV MP3 Text: TRN CHAT

SBC032 Handshakes All Around

A face-to-face conversation that takes place at an outdoor neighborhood 'block party' in Santa Fe, New Mexico. The three main participants are neighbors, age 60 and upward, all of whom happen to be named Tom. Discussion centers on life histories, World War II experiences, and neighborhood gossip. The three are briefly joined by Tucker (the daughter of Tom_1), and Elaine (the wife of Tom_3).

Audio: WAV MP3 Text: TRN CHAT

SBC033 Guilt

A lively family argument/discussion recorded at a vacation home in Falmouth, Massachusetts. There are eight participants, all relatives or close friends. Discussion centers around a disagreement Jennifer (age 23) is having with her mother (Lisbeth).

Audio: WAV MP3 Text: TRN CHAT

SBC034 What Time is it Now?

A late-night face-to-face conversation recorded in Northampton, Massachusetts. Participants are a married couple (Karen and Scott) in their early twenties. Karen has just returned home from work, and the two are talking while winding down for the evening.

Audio: WAV MP3 Text: TRN CHAT

SBC035 Hold my Breath

Lively family argument/discussion recorded in the kitchen of a family home in Pittsburgh, Pennsylvania.

Audio: WAV MP3 Text: TRN CHAT

SBC036 Judgmental on People

Face-to-face conversation recorded in Albuquerque, New Mexico. There are three participants and a baby. Lisa and Kevin are siblings, and Marie (the baby's mother) is a friend of Lisa's. Much of the speech event focuses on interaction with, and talk about, the baby, as well as gossip about friends and co-workers.

Audio: WAV MP3 Text: TRN CHAT

SBC037 Very Good Tamales

Informal, task-related (cooking) talk recorded in the kitchen of a family home in Corpus Christi, Texas. A family is making tamales. Main participants are Julia (an 80-year-old woman), her daughter (Dolores), and grandson (Shane). They are briefly joined by Kate (Shane's sister) who is watching TV in another room. The segment contains occasional codeswitching (English/Spanish).

Audio: WAV MP3 Text: TRN CHAT

SBC038 Good Strong Dam

This segment is part of a tour of Hoover Dam, on the Nevada-Arizona border. The presentation is highly practiced. The main speaker also answers audience questions.

Audio: WAV MP3 Text: TRN CHAT

SBC039 Pretty Busy Bird

Task-related talk, a training meeting recorded at an aquarium in Chicago, Illinois.

Audio: WAV MP3 Text: TRN CHAT

SBC040 Beaten on a Regular Basis

Scripted tour of the Kentucky Horse Park / Museum. Presenter also addresses questions from the audience.

Audio: WAV MP3 Text: TRN CHAT

SBC041 X Units of Insulin

Medical interaction recorded in Southern California. A patient (Paige) is consulting with her dietician (Kristen) regarding management of diabetes.

Audio: WAV MP3 Text: TRN CHAT

SBC042 Stay out of It

Family argument and task-related talk, recorded in Pasco, Washington. The recording begins in a car, and moves to the kitchen of a family home. Main participants are three teenage sisters (Sabrina, Kendra, and Marlena), their mother (Kitty), and step-father (Curt). A friend of Sabrina's (Gemini) is also present. The dispute centers around Kitty's belief that Kendra stayed the night at a friend's house without permission, something which Kendra denies having done. Arguing and shouting are interspersed with Saturday-morning housekeeping chores such as doing dishes and laundry.

Audio: WAV MP3 Text: TRN CHAT

SBC043 Try a Couple Spoonfuls

Face-to-face conversation recorded in the living room of a private home in Boise, Idaho, between Alice (a nurse, age 49) and her daughter Annette (a student and bank employee, age 24). Topics center mostly on their work day, as well as mutual acquaintances.

Audio: WAV MP3 Text: TRN CHAT

SBC044 He Knows

Face-to-face conversation recorded in the living room of a private home in Milwaukee, Wisconsin. Two friends (Cam and Lajuan) are talking about their families and friends, and their own experiences as gay men.

Audio: WAV MP3 Text: TRN CHAT

SBC045 The Classic Hooker

Face-to-face conversation recorded in the living room of an apartment in Milwaukee, Wisconsin. Two friends (Corinna and Patrick) are talking and watching TV. Topics are at times rather raunchy.

Audio: WAV MP3 Text: TRN CHAT

SBC046 Flumpity-Bump Down the Hill

Medical interaction, recorded in Shreveport, Louisiana. A patient (Darren) is consulting with his orthopedist (Reed) regarding a knee injury from a recent skiing accident.

Audio: WAV MP3 Text: TRN CHAT

SBC047 On the Lot

Face-to-face conversation between two cousins (Fred and Richard) in their early thirties, recorded in a private home in east Los Angeles, California. Topics include Richard's new job selling cars, Fred's frustration with factory work, and Richard's recent breakup with his girlfriend.

Audio: WAV MP3 Text: TRN CHAT

SBC048 Mickey Mouse Watch

Christmas morning traditions and gift-exchange among family members, recorded in Fresno, California. Tim and Lea are a couple in their late fifties, Judy is their daughter, and Dan is Judy's boyfriend.

Audio: WAV MP3 Text: TRN CHAT

SBC049 Noise Pollution

Face-to-face conversation recorded at an outdoor family birthday party near Boston, Massachusetts. There are ten speakers, all related. Dan, Al, Lucy, and Annette are four siblings in their mid thirties to mid forties. Allen (Sr.), age 76, is their father. Al and Annette are twins. Linda is Al's wife, and John is Annette's husband. Dave and Jane are Al and Linda's children. Glen is Lucy's son. Topics center primarily on recent renovations to Lucy's home.

Audio: WAV MP3 Text: TRN CHAT

SBC050 Just Wanna Hang

Face-to-face conversation among four roommates, recorded in a shared apartment in Burlington, Vermont. Speakers are all students at the University of Vermont, women ages 20-21. Speakers engage in small-talk, make plans for the evening, and discuss household matters.

Audio: WAV MP3 Text: TRN CHAT

SBC051 New Yorkers Anonymous

Conversation recorded before and during dinner, in a private home in Laguna Beach, California. There are four speakers, ranging in age from mid forties to early fifties. Sean and Bernard are a couple, and Fran is a long-time friend visiting from New York. Alice is also a friend of Sean and Bernard, but had never met Fran. Discussion focuses on travels and on reminiscing about New York City.

Audio: WAV MP3 Text: TRN CHAT

SBC052 Oh You Need a Breadbox

Phone conversation between family members at Christmas. Andrew and Cindy, a couple in their mid forties in Albuquerque, NM, are calling Andrew's sisters in San Antonio, Texas. Discussion centers primarily on Christmas and Christmas gifts, and topics prompted by recent television news shows.

Audio: WAV MP3 Text: TRN CHAT

SBC053 I Will Appeal

Task-related talk recorded in a small claims court in Santa Barbara, California. This segment consists of a judge pro tem hearing and deciding two cases.

Audio: WAV MP3 Text: TRN CHAT

SBC054 'That's Good', Said Tiger

Public storytelling event recorded after a church potluck in Chicago, Illinois. The speaker, a professional storyteller in her mid forties, tells several stories and interacts with the audience.

Audio: WAV MP3 Text: TRN CHAT

SBC055 The Mama of Dada

Public lecture/forum in Santa Barbara, California. Noted artist and ceramist Beatrice Wood gives a public lecture at the Santa Barbara Museum of Art, shortly after her 101st birthday. Wood talks about her life and answers audience questions.

Audio: WAV MP3 Text: TRN CHAT

SBC056 What is a Brand Inspection?

Face-to-face conversation recorded on a ranch near Colorado Springs, Colorado. Julie has recently bought a pony from Gary's wife, and is giving him a bill-of-sale. She then gives a brief tour of her property and barn.

Audio: WAV MP3 Text: TRN CHAT

SBC057 Throw Me

Task-related talk, a recording of a judo class in Shreveport, Louisiana. The five students and their instructor are males between the ages of 22 and 37. The instructor is demonstrating and coaching the Hane-Makikomi throw, which students are practicing with varying degrees of success.

Audio: WAV MP3 Text: TRN CHAT

SBC058 Swingin' Kid

Face-to-face conversation recorded in a private home in Boise, Idaho. Sheri, a single mom in her mid thirties, and her son Steven (age 11) talk while Sheri prepares dinner.

Audio: WAV MP3 Text: TRN CHAT

SBC059 You Baked

Face-to-face conversation, recorded in a family home near Beloit, Wisconsin on Christmas Eve. Cam and Fred are a couple in their early thirties. Jo and Wess are Cam's parents. Topics include talk about family and friends, a football game which Wess and Fred had just finished watching, and holiday baking.

Audio: WAV MP3 Text: TRN CHAT

SBC060 Shaggy Dog Story

Face-to-face casual conversation recorded in an office in Shreveport, Louisiana. The two speakers, Jon (age 72) and Alan (age 66) are friends/co-workers taking a break from work. Alan is primarily telling Jon about his travel adventures and interests.

Audio: WAV MP3 Text: TRN CHAT

Citation

To reference the Santa Barbara Corpus as a whole, the following bibliographical model may be used:

Du Bois, John W., Wallace L. Chafe, Charles Meyer, Sandra A. Thompson, Robert Englebretson, and Nii Martey. 2000-2005. Santa Barbara corpus of spoken American English, Parts 1-4. Philadelphia: Linguistic Data Consortium.

To reference individual parts of the Santa Barbara Corpus, the following bibliographical models may be used:

Du Bois, John W., Chafe, Wallace L., Meyer, Charles, and Thompson, Sandra A. 2000. Santa Barbara corpus of spoken American English, Part 1. Philadelphia: Linguistic Data Consortium. ISBN 1-58563-164-7.

Du Bois, John W., Chafe, Wallace L., Meyer, Charles, Thompson, Sandra A., and Martey, Nii. 2003. Santa Barbara corpus of spoken American English, Part 2. Philadelphia: Linguistic Data Consortium. ISBN 1-58563-272-4.

Du Bois, John W., and Englebretson, Robert. 2004. Santa Barbara corpus of spoken American English, Part 3. Philadelphia: Linguistic Data Consortium. ISBN 1-58563-308-9.

Du Bois, John W., and Englebretson, Robert. 2005. Santa Barbara corpus of spoken American English, Part 4. Philadelphia: Linguistic Data Consortium. ISBN 1-58563-348-8.

Recordings

Most of the audio recordings were originally made on Digital Audio Tape (DAT), recorded in stereo at 32 kHz or 48 kHz, on Sony TCD-D6 or TCD-D7 portable DAT recorders, using small, high quality stereo microphones. (A few early recordings were made on high quality analog cassette recorders.)

The audio data as published by the Linguistic Data Consortium consist of 16-bit, stereo, 22.05 kHz audio files in WAV format (PCM).
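
After downloading a WAV file, its header can be checked against the published format described above (16-bit, stereo, 22.05 kHz). The following is a minimal sketch using Python's standard `wave` module. The function names and the local file path are hypothetical, and files obtained from sources other than the Linguistic Data Consortium may not necessarily match these values.

```python
import wave

# Published LDC format: 16-bit (2 bytes per sample), stereo, 22.05 kHz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "frame_rate_hz": 22050}


def wav_format(path):
    """Read the basic PCM parameters from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": wav.getframerate(),
        }


def matches_published_format(path):
    """True if the file is 16-bit, stereo, 22.05 kHz PCM."""
    return wav_format(path) == EXPECTED


if __name__ == "__main__":
    # Hypothetical local path to a downloaded segment.
    print(wav_format("SBC001.wav"))
```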

Personal names of speakers on the recordings, as well as other identifying information such as telephone numbers, have been replaced by pseudonyms in the transcripts. To preserve the anonymity of the speakers, the corresponding portions of the audio files have been filtered to make them unrecognizable. Pitch information is still recoverable from these filtered portions of the recordings, but the amplitude levels in these regions have been reduced relative to the original signal. A separate filter list file (e.g. SBC001.flt) associated with each transcription/waveform file pair (e.g. SBC001.trn, SBC001.wav) lists the beginning and ending times of the filtered regions. (The file SBC040.flt is empty, indicating that there was no personal information to filter out.)

The filtering was done using a digital FIR low-pass filter, with the cut-off frequency set at 400 Hz. The effect of the filter was gradually faded in and out at the beginning and end of each region over a span of 1,000 samples (roughly 45 milliseconds) to avoid abrupt transitions in the resulting waveform.
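
The fade length can be checked with a little arithmetic: 1,000 samples at the published rate of 22.05 kHz last about 45 milliseconds. The sketch below also illustrates one way such a fade could work, by mixing the original signal with a filtered copy using per-sample weights. It is not the procedure actually used for the corpus. The linear ramp shape, the placement of the ramps inside the region, and all function names are assumptions made for illustration, and the low-pass filtering itself is left out.

```python
SAMPLE_RATE_HZ = 22050  # rate of the published audio files
FADE_SAMPLES = 1000     # fade length stated above


def fade_duration_ms(fade_samples=FADE_SAMPLES, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of the fade in milliseconds."""
    return 1000.0 * fade_samples / sample_rate_hz


def mix_weights(total, start, end, fade=FADE_SAMPLES):
    """Weight of the filtered signal at each of `total` samples.

    ASSUMPTION: a linear ramp placed inside the region [start, end).
    The weight is 0 outside the region, rises over the first `fade`
    samples, stays at 1 in the middle, and falls over the last `fade`.
    """
    if not 0 <= start <= end <= total:
        raise ValueError("need 0 <= start <= end <= total")
    fade = min(fade, (end - start) // 2)  # shrink ramps for short regions
    weights = [0.0] * total
    for i in range(start, end):
        ramp_in = (i - start + 1) / fade if fade else 1.0
        ramp_out = (end - i) / fade if fade else 1.0
        weights[i] = min(1.0, ramp_in, ramp_out)
    return weights


def crossfade(original, filtered, weights):
    """Blend the original and filtered signals sample by sample."""
    return [(1 - w) * o + w * f for o, f, w in zip(original, filtered, weights)]


if __name__ == "__main__":
    print(round(fade_duration_ms(), 2), "ms")
```

Region boundaries given in seconds, such as the beginning and ending times listed in a filter list file, would first be converted to sample indices by multiplying by the sample rate.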

The following additional files are included on the published CDs and DVDs from the Linguistic Data Consortium:

segment.txt explanation of the information contained in segment.tbl
segment.tbl information about the speech event context
segment_summaries.txt brief summary of the content of each discourse segment
speaker.txt explanation of the information in speaker.tbl
speaker.tbl speaker demographic information
table.txt description of file names and informal titles
annotations.txt list of conventions and prosodic annotations

 

Acknowledgements

Major funding for the creation of the Santa Barbara Corpus of Spoken American English was received from the National Endowment for the Humanities in the form of a grant [Grant #RT-21433-92] to Wallace L. Chafe, John W. Du Bois, and Sandra A. Thompson of the UCSB Linguistics Department, and Charles Meyer of the University of Massachusetts, Boston. The initial phases of the project to develop the Santa Barbara Corpus were made possible by a series of grants awarded to Chafe, Du Bois, and Thompson by the Interdisciplinary Humanities Center, the College of Letters and Science, and the Office of Research, all of UC Santa Barbara. Additional funds were received from the Linguistic Data Consortium at the University of Pennsylvania. The completion and release of Parts 2-4 of the Santa Barbara Corpus were facilitated by funding extended by TalkBank, an interdisciplinary research project funded by a grant (BCS-998009, KDI, SBE) from the National Science Foundation to Carnegie Mellon University and the University of Pennsylvania.

Contact

For more information about the Santa Barbara Corpus, contact:

John W. Du Bois, Director
Santa Barbara Corpus of Spoken American English
dubois@linguistics.ucsb.edu