Present: Ronald Jantz (chair), Isaiah Beard, Anne Butman, Tom Frusciano, Judy Gardner, Michael Giarlo, Dave Hoover, Patrick Huey, Linda Langschied, Sam McDonald, Ann Montanaro, Lynn Mullins, Bob Nahory, Jeffery Triggs, Karen Wenk, and Yang Yu (recorder).
1. Current Status of DAWG/NJDH Infrastructure Development
- Fedora 1.0 is installed and has been running continuously throughout June.
- Tests of Amberfish searching over Fedora metadata (Dublin Core) went well for the sample collections NJDH, PCSP, and Eagleton.
- Fedora uses DC as the default metadata schema. Use of other metadata schemas, such as MPEG-7, needs to be investigated.
- Generated basic METS-XML metadata model for images.
- Implemented initial Fedora management system prototype.
- Implemented initial workflow management system design and prototype.
2. Discussion/Review of assignments:
These assignments resulted from the April 29 DAWG meeting. The objective is to have basic capability in all areas by M0
(month zero - November 1, 2003) in preparation for M1 (November, 2003), which is the beginning of the NJDH grant. Grant
members begin digitizing in M2 (December, 2003).
- Digital imaging standards, procedures, and training. Isaiah Beard (lead) with Sam McDonald, Nick Gonzaga, and Judy
Gardner. Develop documents describing standards, workflow, and training to digitize all types of objects, including
single page images, books, multimedia (video), etc. The documents must include equipment standards, a training
outline, and a "Train the Trainer" program. Instructions for equipment should focus on SCC facilities. The
documents should be written in modular form so that new sections on new equipment or format standards can be easily
added.
- Digital imaging workflow system. Patrick Huey (lead) with Jeffery Triggs, Yang Yu, and a new part-time student. Develop
a workflow system that supports and models the various stages of workflow in a database and supports the unique
characteristics of the more complex digital objects. It must support users working remotely with a variety of
scanning devices. The end result of this workflow should be a digitized object. Patrick Huey demonstrated a prototype of
a web-based workflow system with a basic metadata (DC) input template and user authentication.
Issues raised at this meeting include synchronization of data between the MySQL database and Fedora objects, treatment
of collection IDs in the implementation, and which metadata schemas should be used and how they relate to
the ingest process. We decided that this group should also develop some editing
scenarios to present to the full committee. Issues include the persistence of the workflow database and how
we will edit objects directly in the Fedora environment.
Digital Object Workflow diagram
- Web portal development. Patrick Huey (lead). Develop specific portals for NJDH, including public and administrative
portals and portals restricted by content, format, or organization. Authentication should be handled through Shibboleth.
This task is the only one that is NJDH specific; however, the concepts and much of the work will be reusable for other
projects. Patrick Huey demonstrated a prototype web design for the NJDH portal.
- Ingest. Jeffery Triggs (lead) with Patrick Huey, Yang Yu, and a new part-time student. Provide a graphical user
interface that would allow a user to identify the object to be ingested and to specify the required metadata (both
descriptive and preservation). The output would be an XML file that could be directly ingested into Fedora.
- Federated searching. Jeffery Triggs (lead). Provide generic search capabilities to collections represented in Fedora.
This work includes selecting a search engine that will provide both metadata and full text searching, integrating a
Fedora search with other repositories, implementing portals that can be restricted by organization, content, and format,
as well as building an interface to the authentication software.
- Fedora management interface. Jeffery Triggs (lead) with Michael Giarlo. The interface should provide basic capability
for managing objects in a Fedora repository. The end product should allow web access to all of the API-M commands:
ingest, purge, export, and find.
Jeffery Triggs demonstrated a preliminary implementation of the user interface for data ingest, federated
search, and Fedora management, which has basic ingest/view/index/export/purge functions. Attendees suggested
making the interface as simple and user friendly as possible.
- Fedora infrastructure. Michael Giarlo (lead) with Dave Hoover. Install and support the Fedora system, and consult with
application developers to help them learn about the internals of Fedora. Michael Giarlo reported that Fedora v1.0
has been installed and is running at SCC.
- Authentication. Dave Hoover (lead) with Michael Giarlo. Explore techniques for interfacing to local directories. Dave
Hoover noted that the Fedora project is investigating Shibboleth and that we should closely monitor what Fedora will do
to support it. We need to better understand the authentication requirements for NJDH.
- CNRI Handle. Karen Wenk (lead) with Michael Giarlo and Jeffery Triggs. Building on the CNRI Handle system, provide
approaches for assigning handles (persistent IDs) automatically to objects that are deposited in Fedora. This work should be
done in consultation with the Fedora project (they are considering the use of CNRI Handle as an external persistent ID).
As an early test of this process, the work should include exporting journal articles from the Journal Framework
and ingesting them into the journal collection in Fedora with automatic assignment of CNRI Handles. CNRI
Handle has been installed on the SCC Linux server. This work should also determine the handle syntax for
NJDH and how handle assignment will be integrated with the workflow system and Fedora, with special attention to the
editing of objects.
- Modeling of complex objects. Ann Montanaro (lead), Bob Nahory, Jeffery Triggs; via NJDH: Chad Leinaweaver, Dan Noonan,
Kayo Denda, Tom Frusciano. Determine how to represent the collection in Fedora including collection (or sub-collection)
structure and how to model the structure and hierarchy of an object.
- Importing of material into Fedora from existing digital collections. Tom Frusciano (lead) with Patrick Huey. Starting
with Electronic New Jersey and the New Jersey Environmental Digital Library, determine if and how we should import
existing collections into Fedora. Issues raised in the meeting: NJEDL is probably not a good candidate for the
test because of its lack of metadata and some other problems, and some existing digital collections are stored in Access
databases on local computers, which cannot be linked to. Ron suggested that this task needs further discussion, so it
is postponed for now.
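As an illustration of the Ingest item above (a tool that outputs an XML file carrying the required descriptive metadata), here is a minimal Python sketch that builds a Dublin Core record. It is not the ingest tool demonstrated at the meeting and does not produce a complete Fedora ingest file. The function name build_dc_record and the sample values are hypothetical; only the two namespace URIs are the standard Dublin Core and OAI ones.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for simple Dublin Core and its OAI wrapper element.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)


def build_dc_record(fields):
    """Return an oai_dc:dc element with one dc:* child per (name, value) pair.

    Hypothetical helper for illustration; not part of Fedora.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Sample values are invented for this example.
record = build_dc_record([
    ("title", "Sample postcard"),
    ("type", "Image"),
    ("format", "image/tiff"),
])
dc_xml = ET.tostring(record, encoding="unicode")
print(dc_xml)
```

Preservation metadata would need its own section in the ingest file; this sketch covers only the descriptive (DC) part.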
3. DAWG/NJDH Infrastructure
Management Team
- Ronald Jantz
- Linda Langschied
- Ann Montanaro
Development Team
- Patrick Huey - Development team coordinator
- Jeffery Triggs - Application developer
- Mike Giarlo - Infrastructure and authentication
- Isaiah Beard - Scanning and workflow
- Dave Hoover - Infrastructure and authentication
- Yang Yu - Metadata
- Ann Montanaro - Object architecture
4. Metadata
- Ron explained the METS XML for a simple object that has been ingested into Fedora. Issues: its administrative metadata section has been pulled together from various sources, and considerable work is needed to arrive at a workable set of administrative metadata. MPEG-7 has the meta-metadata built in, so using MPEG-7 as the metadata schema may make this easier. A collection-level object has not yet been established in Fedora, and we need to determine what collection-level metadata we will need.
- The Fedora default is to use Dublin Core as the descriptive metadata schema.
- We need the capability to handle different metadata schemas, and we need to know how to handle MPEG-7 and other descriptive metadata within Fedora.
- In a simple object, the METS structure map has no purpose. As we move to complex objects, we will need to determine how to use the structure map and how it relates to the object structure that we set up in Fedora.
- Grace has suggested the possibility of using a subset of MPEG-7. We will need to explore this further, bring in people from the Data Architecture Group (Rhonda?), and perhaps form a sub-team to work on the metadata issue.
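To make the structure map question concrete, the following Python sketch shows roughly what a METS structure map for a multi-page book could look like. It is an assumption-laden illustration, not the model the group has adopted: the function name build_struct_map and the file IDs are hypothetical, while the element and attribute names (structMap, div, fptr, TYPE, ORDER, FILEID) come from the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def build_struct_map(page_file_ids):
    """Build a METS structMap with one book div containing one div per page.

    Hypothetical helper; page_file_ids are assumed to match file IDs
    declared elsewhere in the same METS document.
    """
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    book = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "book"})
    for order, file_id in enumerate(page_file_ids, start=1):
        page = ET.SubElement(
            book,
            f"{{{METS_NS}}}div",
            {"TYPE": "page", "ORDER": str(order)},
        )
        # Each page div points at the file that holds its image.
        ET.SubElement(page, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return struct_map


# Invented file IDs for a three-page example.
struct_map = build_struct_map(["FILE001", "FILE002", "FILE003"])
struct_map_xml = ET.tostring(struct_map, encoding="unicode")
print(struct_map_xml)
```

For a single-image object the map would contain one div and one file pointer, which is why it adds little there. Whether it duplicates the structure already expressed in the Fedora object model is the open question for complex objects.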
5. Near term tasks
Throughout our discussion, we identified specific near term tasks that should be undertaken. These tasks are identified below and should be addressed by the lead person or sub-group identified. Progress on these items should be discussed at the next DAWG meeting.
- Ingest a complex object into Fedora, e.g. a book with many pages. (Triggs)
- Create a Fedora collection object for NJDH and point to the objects in the collection (complex object sub-group).
- Modify the Fedora search to find objects in NJDH via the collection object. (Triggs)
- Ingest a simple object using MPEG-7 as the metadata schema. (Nahory, Yang)
- Run a test to determine the impact on Fedora of using a non-DC metadata schema. (Giarlo)
- Add MPEG-7 to the workflow management system. (Huey)
- Decide how (if?) we are to use the METS structure map. (We can represent structure explicitly by how we model an object. A structure map might not be useful and may only create additional metadata.)
- Implement full text searching of objects that have text associated with them. (Triggs)
- CNRI handles. a) determine when handles should be created, b) how handles should be treated when an object is deleted, c) make sure handle assignment is consistent with the editing process, d) determine if we need a separate handle prefix for NJDH (my opinion - we don't since RUL is the archival agent for NJDH), e) determine handle syntax for NJDH (e.g. 1782.1/njdh.[collection].[subcollection].[name]). (Wenk, Triggs)
- Define candidate scenarios for how objects are to be edited. There are several related issues here as follows: a) we currently do not have the capability to directly edit a Fedora object, b) until an object is ingested, editing will occur in the workflow management system. How long will the object persist in the workflow system? c) should we consider deleting the object from workflow after ingest and then developing the capability to edit Fedora objects? (assigned to workflow subgroup).
- Create an NJDH subgroup that will focus on defining the metadata schema to be used. (Montanaro, Langschied)
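The candidate handle syntax in the CNRI handles task above (1782.1/njdh.[collection].[subcollection].[name]) can be sketched in Python as follows. This is only an illustration of that one candidate pattern, which had not been decided. The function names, the allowed character set, and the sample collection names are all assumptions, and nothing here talks to a Handle server.

```python
import re

HANDLE_PREFIX = "1782.1"  # prefix taken from the example in the task list
PROJECT = "njdh"

# Assumed character set for each component; the minutes do not specify one.
# Dots are excluded because they separate the components.
_COMPONENT = re.compile(r"[A-Za-z0-9_-]+")


def make_handle(collection, subcollection, name):
    """Join the components into prefix/njdh.collection.subcollection.name."""
    parts = [PROJECT, collection, subcollection, name]
    for part in parts:
        if not _COMPONENT.fullmatch(part):
            raise ValueError(f"invalid handle component: {part!r}")
    return f"{HANDLE_PREFIX}/{'.'.join(parts)}"


def parse_handle(handle):
    """Split a handle of the candidate form back into its components."""
    prefix, _, suffix = handle.partition("/")
    parts = suffix.split(".")
    if prefix != HANDLE_PREFIX or len(parts) != 4 or parts[0] != PROJECT:
        raise ValueError(f"not an NJDH handle of the candidate form: {handle!r}")
    return {
        "collection": parts[1],
        "subcollection": parts[2],
        "name": parts[3],
    }


# Collection, subcollection, and object names below are invented.
example = make_handle("pcsp", "postcards", "img0001")
print(example)
print(parse_handle(example))
```

One consequence of this pattern is that the handle encodes the collection, so moving an object between collections during editing would change its handle. That bears on items b) and c) of the task.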
6. Related Items
- Ann Montanaro will take over as co-chair for the next year, starting July 1, 2003.
- Bob Nahory will be added to the DAWG/NJDH infrastructure development team (see above for current members).
- From now on, DAWG meetings will be held at 9:30 am on the third Wednesday of each month.
- The next meeting is scheduled for 16 July.
- The next recorders are Isaiah Beard, Anne Butman, and Tom Frusciano.
- We will set up a DAWG listserv.