Demonstration Draft Specifications (March 2005)

March 17, 2005

 

Draft Technical Specifications for

University of California Digitization Project

 

File Creation

Scanning: (includes deskewing, cleaning, cropping etc.)

preservation file: 8-bit grayscale, 600 pixels per inch, tif file format

current use file: 8-bit grayscale, 400 pixels per inch, tif file format (or jpeg file format, if cheaper, at a medium compression level: 8 out of 10?)

Paper-white level: 250-254

sharpening: moderate

OCR without zoning on 400 ppi files (raw OCR, without any cleanup)

PDF Creation from 400 ppi files
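To give a sense of what the two resolutions above mean for storage, here is a rough Python sketch of the uncompressed size of one 8-bit grayscale page. The 6 x 9 inch page size and the function name are assumptions made for illustration; they are not part of this specification, and real TIFF files add a small amount of header overhead.

```python
def uncompressed_bytes(width_in, height_in, ppi, bits_per_pixel=8):
    """Raw pixel-data size of one scanned page, ignoring file header overhead."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical 6 x 9 inch page (an assumption, not a value from the spec).
preservation = uncompressed_bytes(6, 9, 600)  # 600 ppi preservation file
current_use = uncompressed_bytes(6, 9, 400)   # 400 ppi current use file

print(preservation)  # 19440000 bytes, about 19.4 MB
print(current_use)   # 8640000 bytes, about 8.6 MB
```

Under these assumptions the preservation file is 2.25 times the size of the current use file, because pixel count grows with the square of the resolution.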

 

 

Filenames

The Call Number will be supplied by UC Berkeley with the book. It will be used as the prefix to the filename, with blanks replaced by underscores. For example, F864 .M34 1970 becomes F864_.M34_1970.

The tiffs of the individual pages will use the call number followed by an underscore and the page number, left-padded with zeros to 4 digits, e.g., 0001.

Examples:

The pdf: F864_.M34_1970.pdf

Individual pages: F864_.M34_1970_0001.tif

F864_.M34_1970_0002.tif

F864_.M34_1970_0003.tif

F864_.M34_1970_0004.tif, etc
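The naming rule above can be sketched in Python as follows. The function names are hypothetical, and the check that page numbers run from 1 to 9999 is an assumption drawn from the 4-digit padding; the specification does not say what to do with longer books.

```python
def filename_prefix(call_number):
    """Replace each blank in the call number with an underscore."""
    return call_number.replace(" ", "_")


def pdf_name(call_number):
    """Name of the single PDF for the whole book."""
    return filename_prefix(call_number) + ".pdf"


def page_name(call_number, page):
    """Name of one page tiff: prefix, underscore, 4-digit zero-padded page."""
    if not 1 <= page <= 9999:  # assumption: page numbers must fit in 4 digits
        raise ValueError("page number must be between 1 and 9999")
    return f"{filename_prefix(call_number)}_{page:04d}.tif"


print(pdf_name("F864 .M34 1970"))      # F864_.M34_1970.pdf
print(page_name("F864 .M34 1970", 1))  # F864_.M34_1970_0001.tif
```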

 

Technical Metadata

If all the images in one book are scanned on the same machine, we need the following technical metadata only once per book. If all books are scanned on the same machine, we need it only once for the entire project. If a value is needed that is not listed here, please contact us to get it added.

Tiffs/Images:

Type: controlled vocabulary, pick one

digital still camera

reflection print scanner

transmission scanner

Brand: free text. Examples: Phase One, Epson, Nikon
Model: free text. Examples: PowerPhase, 836xl, LS-2000
Serial Number: free text. Examples: AK001109, 8204058, 212931
Bit Depth: controlled vocabulary, pick one

1 (1 bit bitonal)

16,16,16 (TIFF, HDR)

4 (4 bit grayscale)

8 (8 bit grayscale or palettized color)

8,8,8 (RGB)

8,8,8,8 (CMYK)

Illumination: controlled vocabulary, pick one

D55 Illuminant

D65 Illuminant

D75 Illuminant

Daylight

Flash

Fluorescent

Standard Illuminant A

Standard Illuminant B

Standard Illuminant C

Tungsten Lamp

ColorSpace: controlled vocabulary, pick one

0 (Grayscale, White is Zero)

1 (Grayscale, Black is Zero)

2 (RGB)

3 (Palette Color)

4 (Transparency mask)

5 (CMYK)

6 (YCbCr)

8 (CIELab)

File Format: controlled vocabulary. tif (standard for master image)
Compression: controlled vocabulary, pick one

1 (Uncompressed)

2 (CCITT 1D)

3 (CCITT Group 3)

4 (CCITT Group 4)

5 (LZW)

6 (JPEG)

7 (PackBits)

Color Profile: free text. Name of a well-known profile

Examples: Adobe RGB, ColorMatch RGB

 

If a digital camera is used, we also need:

Filter: 0 if none, name of filter if used
Glass: 0 if none, information on glass if used
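A vendor might check an image metadata record against the controlled vocabularies above before sending it. The following Python sketch shows one way to do that. The dictionary layout, the function name, and the decision to treat every listed field as required are assumptions made for illustration; the sample record, including its color profile name, is invented and not taken from this project.

```python
# Controlled vocabularies copied from the lists above, stored as strings.
CONTROLLED = {
    "Type": {"digital still camera", "reflection print scanner",
             "transmission scanner"},
    "Bit Depth": {"1", "4", "8", "8,8,8", "8,8,8,8", "16,16,16"},
    "Illumination": {"D55 Illuminant", "D65 Illuminant", "D75 Illuminant",
                     "Daylight", "Flash", "Fluorescent",
                     "Standard Illuminant A", "Standard Illuminant B",
                     "Standard Illuminant C", "Tungsten Lamp"},
    "ColorSpace": {"0", "1", "2", "3", "4", "5", "6", "8"},
    "File Format": {"tif"},
    "Compression": {"1", "2", "3", "4", "5", "6", "7"},
}

# Free-text fields; treating them all as required is an assumption.
FREE_TEXT = ("Brand", "Model", "Serial Number", "Color Profile")


def validate(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, allowed in CONTROLLED.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: {value!r} is not in the controlled vocabulary")
    for field in FREE_TEXT:
        if not str(record.get(field) or "").strip():
            problems.append(f"{field}: missing free-text value")
    return problems


# Illustrative record only; the values are invented for this example.
sample = {
    "Type": "reflection print scanner",
    "Brand": "Epson",
    "Model": "836xl",
    "Serial Number": "8204058",
    "Bit Depth": "8",
    "Illumination": "Fluorescent",
    "ColorSpace": "1",
    "File Format": "tif",
    "Compression": "1",
    "Color Profile": "Gray Gamma 2.2",
}

print(validate(sample))  # []
```

The Filter and Glass fields for digital cameras are left out of this sketch; they could be added to FREE_TEXT when Type is "digital still camera".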

 

 

PDFs:

 

Conversion Software: free text (conversion software). Example: Acrobat PDF for Word 5.0
Hardware: hardware platform used to create PDF. Example: Microsoft Windows 2000 Professional
OCR Conversion Software: free text (software used to create OCR). Example: ABBYY FineReader 6.0
OCR Hardware: hardware platform used to create OCR. Example: Microsoft Windows XP
Language: language used in document. Example: English

 

 

File Delivery

 

Files will be delivered to UC Berkeley Digital Publishing Group (DPG) on CDs or DVDs (vendor's choice). The tiffs should be grouped together on one set of disks and the PDFs on another. The technical metadata can be sent as email attachments or on the disks with the PDFs. The vendor will notify DPG when the disks have been sent.
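The grouping rule above (tiffs on one set of disks, PDFs on another) can be sketched in Python as follows. The function names, the in-order packing strategy, and the example capacity are all assumptions for illustration; the specification leaves the choice of media and layout to the vendor.

```python
def split_by_type(names):
    """Separate tiff filenames from PDF filenames, each sorted by name."""
    tiffs = sorted(n for n in names if n.lower().endswith(".tif"))
    pdfs = sorted(n for n in names if n.lower().endswith(".pdf"))
    return tiffs, pdfs


def pack(files, capacity):
    """Fill disks in order. files is a list of (name, size_in_bytes) pairs."""
    disks, current, used = [], [], 0
    for name, size in files:
        if size > capacity:
            raise ValueError(f"{name} does not fit on one disk")
        if used + size > capacity and current:
            disks.append(current)  # current disk is full, start a new one
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        disks.append(current)
    return disks


names = ["F864_.M34_1970.pdf", "F864_.M34_1970_0002.tif", "F864_.M34_1970_0001.tif"]
tiffs, pdfs = split_by_type(names)

# Hypothetical sizes and capacity, chosen only to show the packing behavior.
tiff_disks = pack([(n, 400) for n in tiffs], capacity=700)
pdf_disks = pack([(n, 50) for n in pdfs], capacity=700)
print(tiff_disks)  # [['F864_.M34_1970_0001.tif'], ['F864_.M34_1970_0002.tif']]
print(pdf_disks)   # [['F864_.M34_1970.pdf']]
```

Packing in filename order keeps the pages of one book on adjacent disks, which may make review easier.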

 

DPG will have 90 days to review the files and contact the vendor about problems.