Text Format Descriptions: Full Verbatim
FULL VERBATIM
The text is transcribed exactly as it sounds and includes all the utterances of the speakers.
Those are:
CLEAN VERBATIM
The transcribed text does not include:
Speech errors
False starts (unless they add information)
Stutters
Repetitions. Note: Keep repetitions of words that express emphasis: No, no, no. I
am very, very happy.
Filler words: words the speaker uses excessively and that can be taken out
while still leaving a perfectly understandable sentence, such as uh, um, *you know,
*like, *I mean, *so, *kind of, well, sort of… Be mindful of the context: some of these
words do not always function as filler words.
Expressions should be kept regardless of verbatim type: Oh my God, Oh dear, Oh
my, Oh boy, Oh, et cetera.
Slang words must be written out in full: "got you" instead of "gotcha", "going to"
instead of "gonna", "want to" instead of "wanna", "because" instead of "'cause", et cetera.
"Yeah", "yep", "yap", "yup", "mm-hmm" must be written as "yes"; "alright" must be
written as "all right."
Never write "Ok" or "OK". It must always be spelled "Okay."
Avoid starting phrases with conjunctions in clean verbatim. If you really need to add
the conjunction, just expand the sentence. For example: "I went outside but forgot to
bring my umbrella."
Note for CV: Omit "yeah" and "yes" reactions to keep the text fluent, unless they are
answers to questions that were asked.
DO NOT remove filler words if they change the meaning of the phrase.
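As an illustration only, the fixed word substitutions above (slang, "yes" variants, "all right", "Okay") could be sketched in Python as follows. The table and the function name `normalize_clean_verbatim` are assumptions made for this example, not part of any official tool. The context-dependent rules (filler words, "yes" reactions, conjunctions) still need human judgment and are deliberately left out.

```python
import re

# Substitutions taken from the rules above. The table layout and the function
# name are illustrative assumptions, not part of any official tool.
REPLACEMENTS = {
    "gotcha": "got you",
    "gonna": "going to",
    "wanna": "want to",
    "'cause": "because",
    "yeah": "yes",
    "yep": "yes",
    "yap": "yes",
    "yup": "yes",
    "mm-hmm": "yes",
    "alright": "all right",
    "ok": "okay",
}

# Longest keys are tried first; the lookarounds keep the pattern from firing
# inside longer words such as "okay".
_PATTERN = re.compile(
    r"(?<![\w'-])("
    + "|".join(re.escape(key) for key in sorted(REPLACEMENTS, key=len, reverse=True))
    + r")(?![\w-])",
    re.IGNORECASE,
)


def normalize_clean_verbatim(text: str) -> str:
    """Replace the listed informal spellings, keeping a leading capital."""

    def swap(match: re.Match) -> str:
        word = match.group(1)
        fixed = REPLACEMENTS[word.lower()]
        # Look at the first letter (skipping a leading apostrophe) to decide
        # whether the replacement should start with a capital.
        first_letter = next((ch for ch in word if ch.isalpha()), "")
        if first_letter.isupper():
            fixed = fixed[0].upper() + fixed[1:]
        return fixed

    return _PATTERN.sub(swap, text)


print(normalize_clean_verbatim("Yeah, I'm gonna go 'cause it's ok."))
```

A blind substitution like this will also rewrite words used in another sense (for example "yap" as a verb), so its output should always be reviewed against the audio.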
FV EXAMPLE:
NOTE: If there's a comment next to the audio file saying, "Please use the embedded time"
or "burned-in time," you will need to download the file in order to watch the video and use
the correct time.
MAJOR RULES:
1. If you cannot hear what word is being said, mark that as inaudible or unintelligible
and specify the time. Do NOT make up your own markings. Only use
[inaudible 00:00:00] and [unintelligible 00:00:00].
Use [inaudible 00:00:00] when speech cannot be heard due to poor
recording or noise (keyboard shortcut: Ctrl + K).
Use [unintelligible 00:00:00] when speech can be heard but it cannot be
understood due to the speaker's manner of speech, accent, et cetera (Ctrl +
I).
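The two permitted tags differ only in their label, so producing them can be sketched as below. This is a minimal Python sketch. The function names and the boolean parameter are hypothetical, and it assumes the time is available as seconds from the start of the audio.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds from the start of the audio as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def unclear_speech_tag(seconds: float, heard_but_not_understood: bool) -> str:
    """Return one of the two permitted tags.

    heard_but_not_understood=False: speech is lost to noise or poor recording.
    heard_but_not_understood=True: speech is audible but cannot be understood.
    """
    label = "unintelligible" if heard_but_not_understood else "inaudible"
    return f"[{label} {format_timestamp(seconds)}]"


print(unclear_speech_tag(75, heard_but_not_understood=False))
```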
2. When a speaker is using conjunctions like "and", "so", or "but" to connect longer
stretches of thought, it's often a good idea to create sentence divisions in those
places. Also, don't forget to cut out the conjunctions in those places when
they're not necessary.
3. Longer speeches should be separated into smaller paragraphs. Paragraphs
shouldn't be longer than 500 symbols (about 100 words or 3-4 lines in the
transcription tool).
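One possible way to keep paragraphs under the 500-symbol limit is to break only at sentence boundaries, as in the Python sketch below. The function name and the sentence-splitting pattern are assumptions for illustration. A single sentence longer than the limit is left whole, because the text must never be cut or reworded.

```python
import re

MAX_PARAGRAPH_CHARS = 500  # limit stated in the rule above


def split_into_paragraphs(text: str, limit: int = MAX_PARAGRAPH_CHARS) -> list[str]:
    """Group whole sentences into paragraphs of at most `limit` characters."""
    # Assumption: a sentence ends with ., ? or ! followed by whitespace.
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    paragraphs: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > limit:
            # Adding this sentence would exceed the limit: start a new paragraph.
            paragraphs.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        paragraphs.append(current)
    return paragraphs
```

Where exactly a paragraph should break is still a judgment call (for example at a change of topic); the sketch only enforces the length limit.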
4. Never paraphrase or reconstruct the speech in the audio you are transcribing.
5. Do not correct grammatical errors made by the speakers.
NOTE: Do not use the [sic] tag.
6. Always use the correct spelling for misspoken words.
Mark: Hello.
Speaker 1: Some text.
Speaker 2: Some more text.
14. Occasionally customers dictate instructions to format the transcription while they are
speaking. These instructions should be followed when possible, but never
transcribed. Follow customer requests for spoken directions such as a new
paragraph, comma, period, or a bullet point (use a dash). Do not type out the
instruction.
15. Italicize film, book, magazine, song titles, as well as artworks, plays, TV and radio
programs, foreign expressions et cetera. Example: I watched an episode
of Friends the other day.
Right: USA, PhD
Wrong: U.S.A., Ph.D.
Right American English: Dr., Mrs.
Right British English: Dr, Mrs (without the period)
17. Always research the proper capitalization, e.g., iPhone, UCLA, SaaS.
18. Always write links like this: www.facebook.com/groups/gotranscript. Never write it
like this: w w w dot facebook dot com slash groups slash gotranscript
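The link rule can be illustrated with a small Python sketch. It assumes the speaker only says "dot" and "slash" for symbols, as in the example above. The function name is hypothetical, and other spoken symbols (such as "colon" or "dash") would need their own entries.

```python
# Spoken symbol names seen in the example above; anything else is kept as-is.
SPOKEN_SYMBOLS = {"dot": ".", "slash": "/"}


def spoken_url_to_text(spoken: str) -> str:
    """Join a spelled-out link into its written form."""
    return "".join(SPOKEN_SYMBOLS.get(token.lower(), token) for token in spoken.split())


print(spoken_url_to_text("w w w dot facebook dot com slash groups slash gotranscript"))
```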
19. Sound events
Sound events that are significant to the audio should also be noted. Use
brackets [ ] for notes. The notes are always written in lower
case regardless of the position in a sentence.
Sounds that the speaker makes are always on the same line and always in
the present tense: [snaps fingers] [phone rings] [laughs] [chuckles] [giggles]
[scoffs] et cetera. [laughs] is a normal laugh; [chuckles] is a soft laugh.
Sounds not made by the person speaking are always on a separate line
and use the noun or gerund form: [applause] [cheering] [chuckling] [laughter]
[phone ringing] et cetera.
Use [background noise] on a separate line for ambiance noise. Use
[background noise] on the same line if a significant unidentified sound
occurs while the speaker is talking.
[crosstalk], [silence] - can be placed on a separate line or same line wherever
they occur. [silence] is used to demonstrate a short pause in speech; not less
than 4 seconds but not longer than 10 seconds.
[pause 00:00:00], bolded and time-stamped, is used to mark a significant pause
in speech. The pause must be longer than 10 seconds for it to be marked.
It is always put on a separate line.
When the audio is cut or edited, use [sound cut] on a separate line or the
same line; wherever the sound cut was done.
If a word or passage is spoken in a foreign language (in this case, any language
that is not English), mark it as [foreign language], or as [French language],
[German language], et cetera, if the language can be identified.
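The duration thresholds for [silence] and [pause 00:00:00] above can be summarized in a short Python sketch. The function name is hypothetical, and it assumes the time stamp refers to the moment the pause begins, which the rule does not state. Bolding and line placement are left to the transcription tool.

```python
from typing import Optional


def pause_tag(duration_seconds: float, start_seconds: float) -> Optional[str]:
    """Pick the tag for a gap in speech, or None if the gap is too short."""
    if duration_seconds < 4:
        return None  # shorter than 4 seconds: not marked
    if duration_seconds <= 10:
        return "[silence]"  # 4 to 10 seconds
    # Longer than 10 seconds: time-stamped pause tag.
    total = int(start_seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[pause {hours:02d}:{minutes:02d}:{secs:02d}]"


print(pause_tag(12, 95))
```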
20. Numbers
Spell out single-digit numbers, use numerals for all other numbers: zero, nine,
10, 1492.
Money: $1, $1.5 million, $1,000 (1 grand is $1,000, 5 bucks is $5, 8 quid is £8,
half a million dollars is $500,000).
Years and eras: '90s, 1990s
Age: 70s, 30s
Percentages: 0.2%, 100%
Measurements: 3 degrees, 12 feet, 8 centimeters, 7 pounds, 1.5 kilos, 28
square meters
Mathematical equations and formulas: x = x + 2 or x ^ 3 = 8
Bible citation: John 1:2–3
Fractions: 1/3
Postal code: 91210
Phone number: 123-456-789
Combination: If a sentence combines small [0-9] and large numbers [10 and
up], transcribe all numbers in numerals.
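The basic spell-out rule and the combination rule interact, so a small Python sketch may help. It is an illustration under assumptions: the function name is hypothetical, and it covers only plain whole numbers appearing in one sentence, not money, ages, percentages, or the other special cases listed above.

```python
SMALL_NUMBER_WORDS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def render_numbers(values: list[int]) -> list[str]:
    """Render the plain whole numbers that occur in a single sentence."""
    has_small = any(0 <= value <= 9 for value in values)
    has_large = any(value >= 10 for value in values)
    if has_small and has_large:
        # Combination rule: mixed small and large numbers all become numerals.
        return [str(value) for value in values]
    # Otherwise spell out single digits and keep numerals for the rest.
    return [
        SMALL_NUMBER_WORDS[value] if 0 <= value <= 9 else str(value)
        for value in values
    ]


print(render_numbers([0, 9]))
print(render_numbers([3, 12]))
```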
21. Times of the day and dates: always capitalize AM and PM. Do this: 2:45 PM, 5:00
AM. When using o'clock, spell out the numbers: eleven o'clock.
22. Double dashes or a single dash
Use double dashes -- when there is a change of thought (false start) or a
speech error, or to mark an incomplete sentence. Do this:
FV Speech error: I went to the bank on Tu-Thursday-- no, Friday.
FV False start: I, um, wanted-- I have dreamed of becoming a
musician and--
CV False start that adds to information: Sage is-- You’re right, that
boy is my son.
INCOMPLETE SENTENCE regardless of verbatim type:
I wanted to say something but--
Are you done with that or--
Use single dash -
When the speech is interrupted in a conversation, but the speaker
continues his thought. Do this:
27. Do not remove the word et cetera unless the client asks otherwise in the comment
section.
28. If you do not prepare the transcriptions according to these requirements, you
might be removed from the team of transcribers.
Ratings which are given by editors:
5 - from 96% to 100% accuracy
4 - from 92% to 95% accuracy
3 - from 88% to 91% accuracy
2 - from 83% to 87% accuracy
1 - from 0% to 82% accuracy
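The rating bands above can be read as a simple lookup, sketched here in Python. The function name is hypothetical, and the sketch assumes accuracy is reported as a whole percentage, since the bands as written leave gaps between, for example, 95% and 96%.

```python
def rating_for_accuracy(accuracy_percent: int) -> int:
    """Map a whole-number accuracy percentage to the 1-5 editor rating."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy must be between 0 and 100")
    if accuracy_percent >= 96:
        return 5
    if accuracy_percent >= 92:
        return 4
    if accuracy_percent >= 88:
        return 3
    if accuracy_percent >= 83:
        return 2
    return 1


print(rating_for_accuracy(93))
```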
Transcribers should know that mistakes like the following will be harshly
penalized by editors. Along with accuracy, editors will be rating your files
based on your grammar mistakes and/or lack of research.
If a new transcriber finishes 3 transcriptions with an average rating of 3.6 or
lower, he/she will be removed from the team.
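The removal rule for new transcribers amounts to a simple average check, sketched below in Python. The function name is hypothetical, and the sketch assumes the rule is evaluated on the first three rated transcriptions, which the text implies but does not spell out.

```python
def should_remove_new_transcriber(ratings: list[float]) -> bool:
    """Apply the 3.6 average threshold to the first three ratings."""
    if len(ratings) < 3:
        return False  # the rule only applies once 3 transcriptions are rated
    first_three = ratings[:3]
    return sum(first_three) / 3 <= 3.6


print(should_remove_new_transcriber([4, 4, 2]))
```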