Nineteenth-Century Knowledge Project

Introduction: What is the Nineteenth-Century Knowledge Project?
    About: Acknowledgements for all contributors.

Organization: How we keep hundreds of thousands of files organized.
    Edition-Section System: File organization depends on two basic folder types.
    Folder Names: As the OCR workflow passes through its various stages, production moves into specific folders for each stage. This page gives their names and contents.
    Repositories: A guide to the different repositories used to store ocr-project data.
    Setting Up the Repositories: Create local copies of the remote repositories.

OCR: The procedures we use to get the best quality text recognition in ABBYY FineReader.
    AFR Interface: Learn about the main elements of the program interface.
    Create a Page-Inventory File: Create a page-inventory file.
    Create an Image Collection: Organize image files for scanning.
    Create an OCR-Project: How to create and manage an OCR-Project.
    Settings: Recommended settings for all options in ABBYY FineReader.
    Draw Boxes: Manually creating text recognition boxes improves accuracy.
    Page Recognition: Excellent page recognition depends on preparing pages properly.
    Save and Output: How to output your OCR results.

XML: This introduction to Oxygen XML Editor shows you how to navigate the interface and perform standard procedures on the Encyclopedia files.
    Oxygen Interface: An introduction to the main components of the Oxygen interface.
    Create an XML-Project: Using Oxygen XML Editor to organize files.
    Transform DOCX to TEI: How to convert DOCX files to TEI in Oxygen.

Entry Files: Procedures for converting single pages into Encyclopedia entries.
    Convert Page to Entry Files: Before page files can be converted to entry files, we need to do some housekeeping.
    Entry-Inventory File: Document the filenames of every entry in a section using the entry-inventory file.
    Validate Entry Files: Use Oxygen to validate the entry files.

Reference: Reference information on file/folder names, TEI-encoding standards, and Unicode characters.
    Editorial Standards: The following editorial principles are employed in creating this digital edition.
    Image Sources: Bibliographic information on print editions and image repositories.
    Naming Conventions: Lists the naming conventions we use for editions, sections, folders, and files.
    TEI Style Manual: All TEI encoding must follow these guidelines.
    Unicode Characters: List of Unicode characters and entities used frequently in the Encyclopedia and not on the standard US keyboard.

Project Director: Peter Melville Logan
National Endowment for the Humanities HAA-261228-18