DoD Learning Content: Search and Discovery

By Paul Jesukiewicz
Great strides have been made in the creation and use of distributed content that benefits learning, education and training. The evolution of learning material organized into relatively small and concise content objects promises new possibilities, including the widespread availability of a growing library of content that can be re-used for multiple purposes. However, the early promise of re-use has been largely unfulfilled for both architectural and organizational reasons.
During the past 10 years, the Advanced Distributed Learning (ADL) Initiative’s Sharable Content Object Reference Model (SCORM) evolved to provide a modular, object-based, design approach for learning content objects, which solved key interoperability issues across many learning systems in industry and government. SCORM has enjoyed widespread international adoption and has become a de facto standard in many communities of practice. While SCORM advances the state of the art in the design and creation of interoperable and reusable learning content, it does not address finding and re-using content once it has been created.
An ADL Priority
Searching for content is an issue that ADL must address. Part of this necessity derives from SCORM’s success. In June 2006, the Department of Defense mandated that any DoD entity seeking to acquire new, Web-based content must consider making that content SCORM-conformant. This action was taken to reduce costs, to avoid re-creation of existing learning materials, and to enhance their interoperability and reusability. The mandate also requires a search of existing repositories of content to see if equivalent material already exists before any acquisition can proceed. This latter requirement is a completely reasonable way to reduce costs and avoid re-inventing the wheel. Unfortunately, there is no simple or organized way to carry out such a search.
There is no easy way to search the different content repositories in the DoD. Some systems that can be searched for instructional content are in place, such as DAVIS/DITIS (Defense Automated Visual Information System/Defense Instructional Technology Information System). DAVIS/DITIS was created to provide information on audiovisual and interactive multimedia instruction products available to support training, command information and operational missions. The system served the purposes of search and discovery of fielded content (videotapes, computer-based training CDs, paper courses and other media), but it was not architected to meet the demands of today’s exponential creation of digital learning resources.
To address the issue, ADL set out to:
- Define high-level requirements, policies and business rules for instructional content repositories that constrain the scope of the architecture so that it is practical to implement.
- Identify and relate the most relevant technologies and specifications that can be applied to the architecture.
- Define an architecture on which necessary services may be built.
- Provide an architecture that can scale.
CORDRA
Early in the process, ADL partnered with, and funded, the Corporation for National Research Initiatives (CNRI) to survey the state of the art in content collections management and use, and to develop a framework that could address the unique requirements of the ADL communities of practice. CNRI’s decades of involvement with the evolution of the Internet and World Wide Web, combined with ADL’s experience in learning, education and training, has resulted in a new and different approach to the discovery and use of widely distributed content. The partnership led to the development of a framework called Content Object Repository Discovery and Resolution Architecture (CORDRA). The first implementation of this framework to serve the DoD community is the ADL Registry.
The ADL Registry is the first publicly available CORDRA registry. It was developed as a partnership between ADL and CNRI, with CNRI responsible for much of its development and implementation. The ADL Registry provides a registry of content objects for the DoD and encourages their discovery and re-use by the DoD community and, in some cases, the general public. The ADL Registry website is located at http://adlregistry.adlnet.gov.
Digital Content
Throughout the many ADL communities of practice, within and outside of DoD, hundreds of developers spread over multiple programs, organizations and missions create new digital content every day that could, in theory, be applied to learning, education and training by many other users. Every day, the volume of potentially valuable learning resources increases geometrically. Yet the very existence of these resources is virtually unknown outside a local and relatively small set of users and developers.
The creation and use of digital content for learning remains largely a local affair, independent of the rest of the networked world. Compared to publishing and library science, the learning-content world has been described as very unorganized and, therefore, very hard to organize on any large scale due to the lack of common practices for creating, storing and describing learning content. As shown in Figure 1, multiple communities of content creators and users exist with multiple content repositories and no means of bridging and reusing amongst, or even within, the communities.
Creating content objects for learning, education and training can be expensive. Managing, describing and making these assets available is also expensive. Discovering and accessing content objects is critical even within the organization that creates them; making those assets visible and available more broadly should, in theory, increase the potential value of the investment in content. However, a business “ecosystem” that reinforces such activities has not yet developed.
Successful content discovery and access in learning, education and training communities is difficult because multiple requirements exist that demand different technical, architectural and organizational solutions. Some of these requirements can be met partially or wholly, through known technologies. Others have not been addressed because of the fragmented and specialized nature of learning, education and training communities, and because a great deal of current research and development addresses major business areas of the Internet, such as general searching and social networking.
These special requirements of smaller communities may initially be lost behind the scenes of major Internet markets. The challenge persists because the mainstream Internet development community has not focused on these requirements and because many of them are unique to the learning space.
The requirements for ADL communities of practice to effectively discover and re-use content objects include these key qualities:
- System simplicity to register and make visible content objects with minimal impact on developers and managers;
- System ability to provide immediate value to those who register content objects within their own organization, as well as to others outside their organization;
- System ability to accommodate variations in local administrative policies;
- Employment of common means for describing content objects, that can vary according to the subject and organizational domain;
- Means to federate information about content objects from multiple, separate sources for indexing and searching purposes;
- Framework to make information about content objects visible across multiple domains at a large scale, which is highly desirable;
- System ability that enables content objects to be precisely referenced from within other objects or services and to have those references remain valid over time, including over changes in object location and ownership; and
- Infrastructure that can support the development of value-added services to meet local policy and business rules and life cycle management, and that can become the basis for enabling a return on investment in content creation.
None of the ADL requirements are particularly challenging when taken individually; each has a potential or existing technical or organizational solution. Taken together, however, solutions become challenging due to the number and complexity of the requirements.
Few would question the value of Web-searching technologies such as Google and Yahoo. However, these technologies do not address important requirements of the learning, education and training communities. Three key limitations are:
- Google-type Web search uses indexed data that is created by “crawling” through the entire World Wide Web to see what is accessible and machine readable. A good deal of digital content used for learning is not accessible to Web crawlers (by policy or lack of infrastructure) or is unreadable because it is in a digital format that Web crawlers can’t easily interpret or the digital object is not available or indexed.
- Google-type Web search finds everything and anything that might be relevant; there is no reliable means to filter content for authenticity, validity, currency and other criteria to limit the results to truly relevant learning content.
- Mission-critical use cases cannot afford the hit-or-miss nature of today’s “index-everything” search strategies.
Web searching using Google and similar services provides fabulous value on many levels and will no doubt evolve and improve over time as search algorithms become more sophisticated. But finding content that is intentionally created for specific learning objectives, or that has special, well-crafted instructional or informational value, and that has been vetted, authorized and made available for use by those who really need it, remains a technical and organizational challenge.
Key to addressing this issue is the way in which indexes are built for content objects. As shown in Figure 2, the registry approach creates a master metadata index to allow the discovery of specific objects described by associated metadata. Current search engine technologies do not support this approach.
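The contrast with crawling can be illustrated with a minimal sketch. It assumes a toy, in-memory registry in which content owners submit metadata records and searches run only against those records. The class names, field names and identifiers below are hypothetical and do not reflect the actual ADL Registry schema or interfaces.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical registry entry: it describes an object, it does not contain it."""
    identifier: str   # persistent identifier of the content object
    title: str
    keywords: tuple   # descriptive terms supplied by the registrant
    repository: str   # name of the repository that holds the object


class Registry:
    """Toy master metadata index built only from registered records."""

    def __init__(self):
        self._records = {}
        self._index = defaultdict(set)  # term -> set of identifiers

    def register(self, record):
        """Store the record and index every word of its title and keywords."""
        self._records[record.identifier] = record
        terms = record.title.lower().split() + [k.lower() for k in record.keywords]
        for term in terms:
            self._index[term].add(record.identifier)

    def search(self, *terms):
        """Return records whose metadata matches every query term."""
        if not terms:
            return []
        hits = set.intersection(*(self._index.get(t.lower(), set()) for t in terms))
        return sorted((self._records[i] for i in hits), key=lambda r: r.identifier)


registry = Registry()
registry.register(MetadataRecord("example/001", "Basic First Aid", ("medical", "training"), "repo-a"))
registry.register(MetadataRecord("example/002", "Vehicle Maintenance", ("training",), "repo-b"))
for match in registry.search("first", "aid"):
    print(match.identifier, match.repository)
```

Because only deliberately registered metadata is indexed, every hit refers to an object that its owner chose to describe and expose, whatever format the object itself is stored in.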
ADL Registry
ADL addressed this challenge by developing the ADL Registry. The primary goal of the ADL Registry is to provide the means to register, search for and discover content developed by many independent content developers. The registry assumes that content created locally will be stored in some sort of digital repository, but makes no assumptions about how that repository is implemented or administered. Right away, that requires an approach that has minimal impact on how local repository managers do their business. The system design must accommodate a wide variation of implementations, business rules, and work flows.
As shown in Figure 3, the ADL Registry addresses the U.S. DoD learning community of practice. Other CORDRA registries may be implemented to address other communities of practice. The ADL Registry provides a means to centrally gather, or federate, information about content objects from multiple sources and then index that information for searching and discovery of the content objects. For this approach to be useful, some consistency must exist in the aggregated data, so specific processes for registering content information are specified. The ADL Registry metadata maps to the DoD Discovery Metadata Specification (DDMS), which supports the department’s goal of increased data visibility and enterprise discovery.
ADL participates in the Global Information Grid Enterprise Services Metadata Working Group, which is responsible for configuration management of the DDMS, and ensures consistency with the department’s Net-Centric Data Strategy Objectives.
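The need for consistency in federated data can be sketched as follows. This is an illustration only: it assumes two hypothetical repositories that use different local field names, and it maps them to a small set of common field names before accepting them. The common names used here are placeholders, not actual DDMS elements, and the functions are not part of any ADL Registry interface.

```python
# Hypothetical per-repository mappings: local field name -> common field name.
FIELD_MAPS = {
    "repo-a": {"name": "title", "summary": "description", "uid": "identifier"},
    "repo-b": {"Title": "title", "Abstract": "description", "ID": "identifier"},
}

# Common fields every federated record must carry (an assumption for this sketch).
REQUIRED = ("identifier", "title")


def normalize(source, raw):
    """Translate one locally formatted record into the common field names."""
    mapping = FIELD_MAPS[source]
    record = {common: raw[local] for local, common in mapping.items() if local in raw}
    missing = [f for f in REQUIRED if not record.get(f)]
    if missing:
        raise ValueError(f"{source}: record is missing {missing}")
    record["source"] = source
    return record


def federate(submissions):
    """Gather records from many sources, setting aside those that are inconsistent."""
    accepted, rejected = [], []
    for source, raw in submissions:
        try:
            accepted.append(normalize(source, raw))
        except (KeyError, ValueError) as err:
            rejected.append((source, str(err)))
    return accepted, rejected


accepted, rejected = federate([
    ("repo-a", {"name": "Basic First Aid", "uid": "example/001"}),
    ("repo-b", {"Title": "Vehicle Maintenance", "ID": "example/002"}),
    ("repo-b", {"Abstract": "A record with no identifier or title"}),
])
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Each repository keeps its own local conventions; only the mapping at the point of registration has to agree with the common description.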
Examples
To successfully federate many separate repositories, common metadata approaches must be developed, local business practices reflected, and individual access rights and policies administered. These requirements present a set of ADL community-specific design and technical challenges that must be addressed. However, ADL’s vision and goal is a system that not only meets the requirements of the ADL communities of practice, but that can also be modified and adapted to other communities of practice so that the underlying systems and interfaces are common.
Included in the ADL Registry design and in CORDRA is a set of robust services for creating persistent and unique identifiers that can be used to determine information about the object being identified. This identifier system, called the Handle System and developed by CNRI, is as critical to the architecture of the registry as DNS (Domain Name System) is to the Web. The same system is used by the publishing industry, among others, and has been proposed as a more valuable and useful approach than DNS.
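The value of persistent identifiers can be shown with a minimal sketch. It assumes a toy, in-memory resolver that separates an object's identifier from its current location. It is not the Handle System's actual protocol or API, and the identifiers and URLs below are made up for illustration.

```python
class IdentifierResolver:
    """Toy resolver: maps a persistent identifier to an object's current location."""

    def __init__(self):
        self._locations = {}

    def assign(self, identifier, location):
        """Create a new identifier; identifiers are unique and never reused."""
        if identifier in self._locations:
            raise ValueError(f"{identifier} is already assigned")
        self._locations[identifier] = location

    def relocate(self, identifier, new_location):
        """Record that the object moved; the identifier itself does not change."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_location

    def resolve(self, identifier):
        """Return the location currently registered for the identifier."""
        return self._locations[identifier]


resolver = IdentifierResolver()
resolver.assign("example/001", "https://repo-a.example/objects/first-aid")
reference = "example/001"  # what other objects or services store

# The object later moves to a different repository.
resolver.relocate("example/001", "https://repo-b.example/archive/first-aid")
print(resolver.resolve(reference))
```

References held by other objects store only the identifier, so they stay valid when the object changes location or ownership; only the resolver's record needs updating.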
Rather than build an isolated system with brittle features and capabilities, CNRI and ADL engineered CORDRA as an extensible, modular and scalable architecture that anticipates variations in the future by other communities, as well as new services and capabilities not yet imagined.
2008 will be an important year to determine the success of the ADL Registry. The services and DoD components have significant amounts of content ready to be registered into the ADL Registry but are looking for tools and services that will automate the process. Stay tuned as the ADL Registry populates, its content is discovered, shared and reused, and ADL’s vision of a learning object economy is realized. ♦





