This repository was archived by the owner on Mar 12, 2022. It is now read-only.
Summer 2017
jonathanchu78 edited this page Sep 14, 2017 · 9 revisions
- Sign up for Library Slack at https://uclalibrary.slack.com/signup (existing members will add you to the #builducla-ccing channel)
- Send your GitHub username to an existing member to be granted access to the repo (https://github.com/UCLALibrary/CC-ing)
- Pete will email you other info as needed
- Look through the documentation and code!
Slides from CCing progress report, June 2017: https://docs.google.com/presentation/d/1pGWDBy5ff5xHdb0Ys462BXRf0IjP2zNrlt4sBfYiAvw/edit#slide=id.p
Jonathan's Notes:
- notes about the Scribe data model (useful for figuring out how to integrate OCR/translation data)
- notes about Scribe mark classifications (useful for figuring out how to integrate OCR/translation data)
- notes about constructing a request with mark data to Scribe
- Translation API implementation (probably Python client-server code; need to choose an API, probably Google, and figure out how to query it)
- Creating a test image set and developing a workflow for uploading it
- Continuing work on using the Rails API interfaces to insert transcription data from Tesseract OCR into Scribe
- Extending Scribe DB schema to include all of the catalog fields, as well as duplicate fields for each field's translation (MongoDB, Ruby/Rails)
- Exporting data from Scribe and formatting it as catalog input records (probably Python, JSON, XML, MongoDB?)
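As a starting point for the export item above, here is a minimal Python sketch of turning exported records into XML catalog input. It assumes the Scribe data has already been dumped to a JSON array of flat records; the field names in `CATALOG_FIELDS`, the `_translation` suffix for the duplicate translated fields, and the `<records>`/`<record>` element names are all hypothetical placeholders, since the real catalog schema is not described on this page.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical field names; replace with the real catalog fields once known.
CATALOG_FIELDS = ["title", "creator", "date"]
# Assumed naming convention for the duplicate field holding each translation.
TRANSLATION_SUFFIX = "_translation"


def record_to_xml(record):
    """Build one <record> element from a dict of exported field values."""
    elem = ET.Element("record")
    for field in CATALOG_FIELDS:
        for name in (field, field + TRANSLATION_SUFFIX):
            value = record.get(name)
            if value:  # skip fields that were never transcribed or translated
                ET.SubElement(elem, name).text = str(value)
    return elem


def export_to_xml(json_text):
    """Convert a JSON array of exported records into an XML string."""
    root = ET.Element("records")
    for record in json.loads(json_text):
        root.append(record_to_xml(record))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = '[{"title": "Kitab", "title_translation": "Book", "date": "1901"}]'
    print(export_to_xml(sample))
```

Only the standard library is used here, so the sketch does not depend on a MongoDB driver; reading directly from the Scribe database would replace the `json.loads` step.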