Newspapers, OCR'd Teasers, and Make-Do Citations



By Elizabeth Shown Mills

For researchers, the Internet is an all-in-one paradise, purgatory, and hell.  It tempts us with wondrous things packaged in confusing ways. When mishandled or misread, it can create results for which some researchers will damn us forever. A recent query to EE’s Citation Issues Forum makes the point. We might paraphrase the question this way:

How do you cite a website that gives you only a teaser—Twitter-sized snippets from many separate articles all run into one jumbled mass?

At this point, you might also be asking, “Why? ... Why would we even want to cite the site?”

Our inquirer has self-identified as a non-user of EE and a non-subscriber to the website that offers these teasers to entice people to join. Joining the site to access the full text of relevant newspaper articles is not an option for her, for the time being. But in those teasers, she sees nuggets of information she would like to capture and add to her database.

Are we now at the point of thinking, "Well, but ..."?

Our inquirer’s proposed citation shows that she understands the basic format for citing newspaper articles and (mostly) the basic format for citing websites.

But, of course, reliable citations and reliable research require more than just knowing canned formats that can be plugged together, or having a handy-dandy template in a software program to do the mental drudge work for us. There are also core principles we need to understand, for both citation and analysis. Otherwise, we’ll find ourselves dangling over the research equivalent of hell-fire and brimstone.

The case at hand involves these three principles:

  1. We cite what we use; if we don’t actually use it, we don’t cite it. (EE 2.21)
  2. We add qualifiers in every place where uncertainty exists. (EE 1.6)
  3. Citing index entries or garbled online snippets should be only a stop-gap measure—a placeholder until we can access the real deal. (EE 2.12)


The Source

The actual text offered by the website is pasted below. The one name-of-interest is boldfaced:

Page 2 THE DAILY HERALD Tuesday, Feb. 1, 1955 OIL Development in Dubois County By Virgil W. Kays of Evansville At the present time there are two very interesting wildcat tests underway in Dubois county. Both are located about eight miles southwest of Huntingburg, just above the Spencer County line. Carl E. Michel of Dale is drilling at 8C8 feet on his No. 1 Elmer Roesner in section 35-3s-6w, two miles south of Holland. This test, being drilled with the cable tools of Dobbs & Vickers of Owensboro, is going to the McClosky limestone. About two miles southeast of Holland in section 30-3s-5w, A. B. Barrow of Evansville is drilling at 571 feet on the No. 1 Wm, Land. Lightfoot & Siddons of Rockport are the cable tool contractors. The No. 3 Wm. Struckman is on production but is pumping mostly water with only a few barrels of oil a day. This well, located in section 17-3s-5w, five miles south of Huntingburg, is being operated by H. C. Henderson of Jasper. REPEATED BY POPULAR DEMAND... TURN YOUR 01D WASHERS to ABMSTRONG’S STORE ADD Gfl A DOUBLE TRADE-IN ON A HEW WASHER! You’ve Always Wanted a Maytag. Now Is the Time to Buy. Never Again Will Your Old Washer Be Worth as Much! DON'T MISS THIS! IF YOU CAN’T COME IN, GIVE US A RING We’ll be at your house and make you an offer . . . WITHIN 30 MINUTES THURSDAY-FRIDAY-SATURDAY ONLY! ARMSTRONG'S STORE IN THE Y PKONE 41 YVTTV-T V Wednesday, February ‘-i, 1 7 :00 Today !>:00 Dinst l)ons School 0:”.0 Foy Willing !»: 1 .*> Sheilah Graham 10:00 Home 1 1 :00 Tennessee Ernie Ford 11:510 Featht r Your Nest 12:00 K. F. I). 1 12:30 Garry Moore 12:4a Les’s Cartoons 1:00 Country Matinee 1 :*10 The Earth Story 2:00 The Greatest Cift 2:1a (¡olden Windows 2:510 One Man’s Family 2:15 Concerning Miss Marlowe 51:00 Hawkins Falls 3:1a I. 1. Heeital Hall 3:510 World of Mr. 
Sweeney 51:1a Modern Homanees 5:00 Western Ledger 1:5?0 Howdy Doody 5:00 Milt’s Music Mart (5:00 Front I’age News 0:15 Bud Wilkinson (»:510 Eddie Fisher 0:15 News Caravan 7:00 I Married Joan 7:510 Royal Playhouse 8:00 Inspector Mark Saber 8:30 Secret File, t.S.A. 0:00 This Is Your Life 0:510 Big Town 10:00 Norby 10:510 Tonight In Indiana 1 I :00 Tonight 12:00 Sign Off 7:00 0:00 0:5!0 0:45 10:00 I 1 :00 11 :30 12:00 12:510 I :30 2:00 2:1 r> 2:30 2 : 15 3:00 51: 1 5 3:510 3:45 4 :00 1:510 5:00 5 : I 5 0:00 0:30 6:43 7 :0O 7 :se 8:00 0:00 0:510 10:00 10:510 10:515 1 2 :05 W.VVE-TY Wednesday, February Today Ding Dong School Wav of the Wor!u Sheilah (¡raliam Home Tennessee Ernie Feather Your Nest Funny Flickers Movies at Midday Ladies’ Fare The Greatest Gift Golden Windows One Man’s Family Concerning Mis«. Marlowe Hawkins Falls First Love World of Mr. Sweeney Modern Romances Pinky Lee Show llowdy Doody Sundown Theater Three S*:»r Final It’s A (¡reat Life Eddie Fisher Show News Caravan I Married Joan My Little Margie TV Theater This Is Your Life Liberace TV Readers Digest Do You Know Why? Starlight Theater Late News and Weather 1955 Robert Skaggs, Christian Seitz, Alvenia Mathies, Donald Meyer, Norma Lee Neukam. Janet Bleemel, Lowell Dorsam, Ellateen Voelkel, Barbara Zehr. Second Honor (3 A’s, Not Below B) Shirley Hall, Janet Hodges, Gene Neukam, Frances Patcheak, Shirley Bonifer, Linda Buchta, Joan Dorsam, Florence McGuire, Alice Sander, Bill Skaggs, Linus Bauer, Helen Dorsam, Sharon Perkins, Edna Zehr, Third Honor (2 A’s, Not Below B) Shirley Nordhoff, John Popp, Tom Reck, Diane Rohrscheib, Mary Siefrig, Norman Keller, Donald Patcheak. NEW CINCINNATI COACH CINCINNATI —INS— Army backfield coach George Blackburn has been named head football coach at University of Cincinnati. Blackburn, 41, replaces Sid Gillman who became head coach of the Los Angeles Rams of the National Pro Football league. Your Best Buy! BARTON Quality WASHERS For over 25 years dependable, big values. 
Easy Terms JOE STEMLE Home Appliance Center Phone 702 DUBOIS HIGH SCHOOL NOTES New G-E DRYER- CON DITION ER There is a lot of color around the halls of Dubois High School since the Sophomores have received their beautiful blue and white jackets. The Junior Class has selected their class play which will be presented in April. It is “A Date With Judy”, with a cast of nine girls and four boys. The seniors received their pictures from the Bilrnar Studios and most of them were quite flattering. They have also received their name cards. The members of the senior class will spend a day at the state Legislature on February 9. They will go by Bluebird Bus. The senior members of the Beta Club will attend the ‘‘Holiday on Ice” show at the Louisville Armory on February 6. The basketball team will attend the Indiana University—vs. Wisconsin basketball game at Bloomington on February 7. The Honor Roll for the 1st semester follows. First Honor, All A’s Mary Ann Baer, Geraldine Nonte, Modfl DA623M Drys • Fluffs • Sprinkles • Refreshes Clothes ELECTRICALLY OEIUXE FEATURES FOR CAREFREE WASHDAYS • Simple Dial and Pushbutton Controls • Automatic Sprinkler • G E Ozone Lamp • Either 115- or 230-volt Operation Whenever you’re in for a gas-up, we quickly check those little details that call for periodic attention — tires, battery, etc. ... at no extra charge . . . STOP HERE ONCE . . . YOU’LL COME BACK OFTEN And for ROAD SERVICE ... PHONE 914 . . . We'll be there in no time at alL mehling & Dischinger Standard Service AT THE “TOP OF THE Y” JASPER ’iirmmrafii rrTriinr'rTi~7TUTi~i ,i wiiTwrniii ¿ wii

The inquirer would like to “build the reference to the fact that Helen Dorsam was on the 2nd honor roll at Dubois High School (3A’s none less than a B.”  The web page also offers a thumbnail image of the newspaper page; and our inquirer feels she can discern that the item of interest “was in col 4 & likely continued into col 5.”  On that basis she proposes this citation:

     1. "Dubois School Notes," The Herald (Jasper, IN), 1 Feb 1955, p. 2, col. 4 & 5; OCR text and small digital image ( : accessed 9 May 2015).

“But,” she continues, “that doesn't cite the specific person. ... Do I really want to cite one person out of a list?”


The Real Issues

For the time being, we’ll put that final question on hold. What’s critical are the three numbered issues we introduced above the OCR’d text:

  1. We cite what we use; if we don’t actually use it, we don’t cite it. If our source identifies its own source, then we add a note (or a second layer) to our citation to say that our source has cited thus-and-such. But we do not, in any way, leave the impression that we took our information from the source we did not actually use.
  2. We add qualifiers in every place where uncertainty exists.
  3. Citing index entries or garbled online snippets should be only a stop-gap measure.  It’s a make-do that’s justified only until we can access the real deal. In the meanwhile, it’s pure folly to accept that “information” at face value.

If we take our “information” from that garbled OCR’d text, which plugs together snippets from each article on the page with no indication of where one item starts and another ends, we are not citing the newspaper.


The Citation Challenge

The real citation question here is this: How do we cite these garbled teasers? If EE were to cite a portion of this page (perhaps as an example in a discussion of how reliance on a snippet such as this misleads us into making a wrong conclusion), our citation might look like this:

     1. OCR’d text snippet, ( : accessed 9 May 2015); citing “February 1, 1955, The Herald from Jasper, Indiana · Page 2.”

Our citation, you’ll note, does not include two things that our inquirer included in her proposed citation:

  • Column number. We don’t cite that, because the source we actually used does not cite column number. In this case, squinting at the thumbnail on the web page has led our inquirer to a guesstimate of one, maybe two, columns. But any guesstimate made in a citation should say that it’s guesswork. Sometimes, as with unnumbered pages in a book, it’s justified, so long as the basis for our approximation is explained (“thirteenth unnumbered page past figure 4”). In this case, when dealing with one page of a newspaper, it’s simply not necessary and, more seriously, leaves the impression of preciseness where no preciseness exists. All we should do, in this instance, is cite what our source cites.
  • Article title. The proposed citation presents a title in quotation marks: “Dubois School Notes.” Quote marks, of course, mean “I’m copying this exactly.” In this case, the garbled teaser carries the words “DUBOIS HIGH SCHOOL NOTES” but it appears several lines after the item-of-interest. The inquirer feels she can read, from the thumbnail, an article title “Dubois School Notes,” but we’re left with a discrepancy between her proposed title and the words that are actually OCR’d. Given the inability to enlarge the thumbnail to attain some certainty, good practice would not call for creating a formal title that has not been verified, especially since a conflicting possibility exists.


The Personal Identification Question

Our inquirer’s final  question is an excellent one: Is it necessary to repeat, in the citation, the specific information about the person-of-interest?  Yes. Absolutely. The whole snippet does not have to be repeated. The narrative discussion to which this reference note would be attached should carry the full details (to the extent they are presently known or surmised) and it should use qualifiers (appropriately chosen “weasel words”) to indicate that the interpretation is still questionable. The reference note, then, would provide a shortened caution.

In the case at hand, our inquirer feels that she has reached the correct interpretation of the garbled teaser. It’s even likely that she has, in this case, because of the structure of the item-of-interest (although making conclusions from most other passages in the teaser would be far riskier).  Even so, a careful researcher would cite this only as a temporary placeholder until the original page is consulted. One way might be this:

         1. OCR’d text snippet, ( : accessed 9 May 2015); citing “February 1, 1955, The Herald from Jasper, Indiana · Page 2”; the garbled text that includes Helen Dorsam’s name seems to say (as suggested in our narrative) that she was on the “2nd honor roll,” but the actual newspaper article has not been consulted.


The Bottom Line

We cite what we actually use. We add qualifiers in every place where uncertainty exists. And Clio, the Goddess of History, would say that reaching conclusions from flaw-riddled sources is a bit too much flirting with the devil.


Posted 14 May 2015