U.S. Department of the Interior

 

 

Interior Enterprise Architecture

 

 

 

 

Chapter 3

Data Management Architecture

Version 2.0

 

 

 

 


 

 

 

October 15, 2003



3.1              Introduction and Background

 

Data is the representation of facts, concepts or instructions in a formalized manner suitable for communication, interpretation or processing.  When data is combined appropriately, information is derived.  Much like the natural resources it manages, Interior’s data and information are valuable assets that must be managed.  The full value of data and information resources is realized when Interior is able to appropriately share that data and information internally, as well as with external partners.

 

The focus of the Interior Enterprise Architecture is on providing guidance for information technology (IT) issues and initiatives that are Interior-wide or multi-bureau in scope. The Data Management architecture defines the mechanisms and standards for collecting, documenting, accessing, managing, maintaining the integrity of and securing Interior’s electronic data assets.

 

If used correctly, the Interior Enterprise Architecture will act as a catalyst for those looking to capitalize on its contents and better understand the full meaning of its guidance. This understanding will permit IT personnel to better engage the non-IT organization in discussions around tradeoffs and priorities within the proper governance structure (e.g., Management Initiatives Team (MIT), Information Technology Management Council (ITMC)). The Interior Enterprise Architecture is not intended to be the “last word” (e.g., some automated checklist for product selection). It is intended to be one of the “first words” to assure that Interior’s mission priorities and its IT priorities remain closely aligned.

 

There are many instances within Interior of data sharing and reuse.  Conversely, there are also many examples where data is not reused and shared enterprise-wide but is collected and duplicated in innumerable databases throughout Interior or even within a single Bureau (e.g., names, addresses, and social security numbers may be stored and maintained in every application system that needs that particular data).  As a result, it is difficult to determine which database stores the most current or correct information.  Storing and maintaining multiple copies of the same data throughout the enterprise is time-consuming and expensive.

 

Because Interior is incorporating the OMB’s Federal Enterprise Architecture (FEA) models, the technical guidance provided by the subject area experts within a domain spans both the Service Component Reference Model (SRM) and the Technical Reference Model (TRM). For the Data Management domain, the SRM elements are as follows:

 

Service Domain(s):    The Back Office Services Domain defines the set of capabilities that support the management of enterprise planning and transactional-based functions.

 

Service Type(s):         Data Management – defines the set of capabilities that support the usage, processing and general administration of unstructured information.

 

Development and Integration – defines the set of capabilities that support the communication between hardware/software applications and the activities associated with deployment of software applications.

 

                                   

Component(s):            Data Classification – defines the set of capabilities that allow the classification of data.

 

Meta Data Management – defines the set of capabilities that support the maintenance and administration of data that describes data.

 

Data Cleansing – defines the set of capabilities that support the removal of incorrect or unnecessary characters and data from a data source.

 

Data Exchange – defines the set of capabilities that support the interchange of information between multiple systems or applications.

 

Data Recovery – defines the set of capabilities that support the restoration and stabilization of data sets to a consistent, desired state.

 

Extraction and Transformation – defines the set of capabilities that support the manipulation and change of data.

 

Loading and Archiving – defines the set of capabilities that support the population of a data source with external data.

 

Data Mart – defines the set of capabilities that support a subset of a data warehouse for a single department or function within an organization.

 

Data Warehouse – defines the set of capabilities that support the archiving and storage of large volumes of data.

 

Data Integration – defines the set of capabilities that support the organization of data from separate data sources into a single source using middleware or application integration as well as the modification of system data models to capture new information within a single system.
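
As a rough illustration of how three of the components above (Data Cleansing, Extraction and Transformation, Loading and Archiving) might fit together, the following Python sketch cleans a few records and loads them into an in-memory target. The record layout, field names and function names are hypothetical; they are not drawn from the FEA SRM or from any Interior system.

```python
import re

def cleanse(value: str) -> str:
    """Data Cleansing: drop control characters and collapse repeated whitespace."""
    no_control = re.sub(r"[\x00-\x1f\x7f]", " ", value)
    return re.sub(r"\s+", " ", no_control).strip()

def transform(record: dict) -> dict:
    """Extraction and Transformation: map source fields to normalized target fields."""
    return {
        "name": cleanse(record["NAME"]).title(),
        "state": cleanse(record["ST"]).upper(),
    }

def load(records: list, target: list) -> int:
    """Loading: populate the target data source with the external records."""
    cleaned = [transform(r) for r in records]
    target.extend(cleaned)
    return len(cleaned)

# Hypothetical source records with stray whitespace and control characters.
source = [
    {"NAME": "  jane\tDOE ", "ST": "co"},
    {"NAME": "john  smith\n", "ST": " ut "},
]
warehouse = []
loaded = load(source, warehouse)
```

Keeping each step in its own function mirrors the way the SRM separates these capabilities, so that one step could be replaced without touching the others.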

image 004

These SRM service elements are likewise supported by Interior’s IT (technical) infrastructure (e.g., servers, networks). Within this infrastructure are individual TRM components for which this domain team is providing guidance. The graphic below outlines those TRM elements for this domain that support the service needs of the SRM.

 

Additionally, it’s doubtful that a single domain chapter from the TRM can be used to address a substantive issue.  More realistically, several architecture domains may need to be reviewed when addressing an important IT decision.  For example, if Interior were considering the creation of a new Interior-wide Web application for use by both the general public and Interior personnel, then TRM chapters such as Data Management Technologies, Information Security, Distributed Systems Management and Application Development might all need to be reviewed.

 

3.2       Architectural Principles

 

The principles listed below provide guidance for the design and selection of technology components that will support the data management needs of Interior-wide IT initiatives.

 

Principle 1:      Data Sharing

 

Data and information must be managed to facilitate data sharing across Interior, with our partners and the public.

 

Rationale:

  • Reduces duplication of effort.

 

  • Achieves economies of scale, especially through cooperative data collection efforts.

 

  • Leads to increased data quality.

 

  • Conforms to the Government Paperwork Elimination Act, Clinger-Cohen Act, Paperwork Reduction Act, Electronic Freedom of Information Act Amendments of 1996 and Section 508 of the Rehabilitation Act.

 

  • Enhances reusability of data and information.

 

Implications:

  1. Data and information resources will need to be defined in bureau and department information architectures.
  2. Need to agree upon data exchange mechanisms and protocols.
  3. Data that is common among many business applications will be sourced and updated from a single authoritative source.
  4. Need to establish common core data standards, including data definitions.
  5. Need to agree upon and establish a common data standards process.
  6. Need a consistent data management process.
  7. Need well-documented and defined metadata.
  8. Sharing and access needs to be timely.
  9. Additional effort may be required in the presentation of data to meet accessibility requirements.
  10. Data should be made available in a variety of formats suitable for the user.
  11. The value of information is increased when not held in isolated pockets.
  12. Need to balance the desire to share data with sensitivity, privacy and confidentiality restrictions.
  13. Need to take electronic records management requirements into consideration.
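
Implication 3 (sourcing common data from a single authoritative source) can be sketched as a small registry that applications read from instead of keeping their own copies. This is only an illustrative sketch: the class name, the steward label and the record fields are hypothetical and do not describe an existing Interior system.

```python
class AuthoritativeSource:
    """Holds the one maintained copy of a shared data element (hypothetical)."""

    def __init__(self, steward: str):
        self.steward = steward
        self._records = {}

    def update(self, key: str, value: dict) -> None:
        # All updates go through the single source, so there is one current copy.
        self._records[key] = dict(value)

    def get(self, key: str) -> dict:
        # Return a copy so that a consumer cannot change the source by accident.
        return dict(self._records[key])

addresses = AuthoritativeSource(steward="Hypothetical Bureau Data Steward")
addresses.update("emp-001", {"city": "Denver", "state": "CO"})

# Two applications read the same record rather than storing duplicates.
payroll_view = addresses.get("emp-001")
permits_view = addresses.get("emp-001")

# A later correction is made once, at the source, and is visible on the next read.
addresses.update("emp-001", {"city": "Boise", "state": "ID"})
refreshed_view = addresses.get("emp-001")
```

Because every consumer reads from the same place, the question of which database holds the most current value does not arise; a consumer only needs to read again to pick up a correction.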

 

 

Principle 2:      Data Collection and Reuse

 

In considering data requirements, we should look to reuse existing data before we buy.  If no data exists within Interior, consider acquisition of data from external sources before collecting/creating new data.

 

Rationale:

  • Saves money.

 

  • Leads to increased data quality and integrity.

 

  • Saves time.

 

  • Supports the promotion of standards.

 

  • Supports the Federal Activities Inventory Reform Act, Paperwork Reduction Act and Clinger-Cohen Act. 

 

Implications:

  1. Need a clearinghouse of metadata for existing data.
  2. If you are going to acquire data, consider facilitating its use by all of Interior.
  3. Potential data sources’ data quality must be validated before acquisition or collection of data.
  4. We are at the “supplier’s” mercy for future cost, quality, availability, service and metadata.
  5. Good data requirements are needed to evaluate potential data sources.
  6. Need a standard process for acquiring data, when formal agreements are required.
  7. Data that is common among many business applications will be sourced and updated from a single authoritative source.
  8. When acquiring data from private vendors, licensing restrictions should be considered.
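
The order of preference stated in this principle (reuse existing Interior data, then acquire data externally, then collect new data) can be sketched as a lookup against a metadata clearinghouse. The catalog structure, the entries and the function name below are hypothetical and serve only to illustrate the decision order.

```python
def choose_data_source(need: str, interior_catalog: dict, external_catalog: dict) -> tuple:
    """Return (strategy, source) following the order reuse, acquire, collect."""
    if need in interior_catalog:
        return ("reuse", interior_catalog[need])
    if need in external_catalog:
        return ("acquire", external_catalog[need])
    # Nothing suitable exists, so new collection is the last resort.
    return ("collect", None)

# Hypothetical clearinghouse entries keyed by the kind of data needed.
interior = {"land parcels": "Hypothetical Bureau A parcel database"}
external = {"census tracts": "Hypothetical external provider"}

decisions = {
    need: choose_data_source(need, interior, external)
    for need in ["land parcels", "census tracts", "trail counts"]
}
```

In practice the lookup would also weigh the data quality and licensing considerations listed in the implications above; the sketch shows only the ordering.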

 

 

Principle 3:      Data Security

 

Data needs to be secured according to its sensitivity.

 

Rationale:

  • Complies with the Computer Security Act, the Privacy Act, the Government Information Security Reform Act, Office of Management and Budget (OMB) Circular A-130 Appendix 3, Electronic Freedom of Information Act Amendments of 1996, Computer Matching and Privacy Protection Act and Section 515 of the Treasury and Consolidated Agency Appropriation Act.

 

  • Enhances public trust.

 

  • Helps safeguard confidential and proprietary information.

 

  • Enhances the proper stewardship over information.

 

  • Enhances the integrity of the information.

 

  • Helps to ensure legal and proper use of information.

 

Implications:

  1. Need to establish data sensitivity and privacy classifications and a review process.
  2. Need to conduct periodic (re)assessments of data classifications.
  3. May require additional resources (e.g., personnel, hardware and software).
  4. Will lead to the use of authentication technologies; for example, digital signatures or passwords.
  5. May need to edit sensitive data that is released to the public.
  6. Employees and contractors will require training regarding use of sensitive data.
  7. Data stewards will require training on this “new” stewardship responsibility.
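
Implications 1 and 5 (classifying data by sensitivity and editing sensitive data before public release) can be illustrated with a short Python sketch. The three sensitivity levels and the field assignments are assumptions made for the example; actual classifications would come from Interior policy and the review process described above.

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    # Hypothetical levels, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2

# Hypothetical classification of individual fields.
FIELD_CLASSIFICATION = {
    "name": Sensitivity.INTERNAL,
    "office": Sensitivity.PUBLIC,
    "ssn": Sensitivity.RESTRICTED,
}

def release(record: dict, clearance: Sensitivity) -> dict:
    """Return only fields at or below the clearance; unclassified fields are withheld."""
    return {
        field: value
        for field, value in record.items()
        if FIELD_CLASSIFICATION.get(field, Sensitivity.RESTRICTED) <= clearance
    }

employee = {"name": "Jane Doe", "office": "Denver", "ssn": "000-00-0000"}
public_copy = release(employee, Sensitivity.PUBLIC)
internal_copy = release(employee, Sensitivity.INTERNAL)
```

Treating any unclassified field as restricted is a deliberately cautious default in this sketch: a field that has not yet been through the review process is withheld rather than released.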

 

 

Principle 4:      Data Contingency Planning

 

Contingency planning processes need to be in place to ensure data availability.

 

Rationale:

  • Supports Interior Continuity of Operations (COOP) plans.

 

  • Ensures continued operations.

 

  • Protects Interior data.

 

  • Complies with Federal Preparedness Circular 65, Federal Executive Branch Continuity of Operations (COOP).

 

  • Allows Interior to continue its mission and meet legal requirements.

 

Implications:

  1. Need to establish data recovery priorities.
  2. Resources must be provided for data recovery testing.
  3. Alternative off-site data archives need to be in place and synchronized.
  4. Need periodic reassessment of bureau/Interior Continuity of Operations (COOP) plans to ensure data availability is addressed.

 

 

Principle 5:      Data Lifecycle

 

Information is valued as an Interior asset; therefore, Interior data needs to be managed throughout its lifecycle.

 

Rationale:

  • Data has its own lifecycle related to the lifecycle of the mission, not the information system.

 

  • Facilitates data reuse and locating data at each stage of its lifecycle (including historical).

 

  • Managed data improves the ability to accelerate sound decision-making.

 

  • Meets the legal requirements of Paperwork Reduction Act, Government Paperwork Elimination Act, Federal Records Act, Clinger-Cohen Act, the OMB Information Initiative on the National Spatial Data Infrastructure and OMB Circular A-130 regarding data quality (i.e., utility, objectivity and integrity).

 

  • Increases the usefulness and value of data.

 

  • Promotes the wise use of Interior data assets.

Implications:

  1. Interior needs to dedicate resources to data management in addition to relying on the DBA function.
  2. Need ongoing management support and oversight throughout the data lifecycle:  data stewards, data managers and data administrators.
  3. Management of data needs to be tied to workflow of the business process.
  4. Data management (including its ongoing storage and archiving) is a mission cost that transcends an individual project.
  5. Data quality is everyone’s responsibility.
  6. Integrate data resource planning with business and information technology planning.
  7. Need a consistent data management process.

 

 

Principle 6:      Data Stewardship

 

Data and information must be managed and maintained as a stewardship responsibility to support the mission of the department.

 

Rationale:

  • Data is a resource critical to the mission of Interior.  Like natural and cultural resources, data needs stewards who are responsible for its valuation, preservation, security, access, quality, and utilization.

 

  • Data stewardship promotes common business rules, facilitates information sharing and improves data integrity.

 

  • Data stewardship promotes the establishment of authoritative sources.

 

  • Complies with requirements of Section 515 of the Treasury and Consolidated Agency Appropriation Act.

 

Implications:

  1. Need to develop a data stewardship program that will transcend many organizational boundaries.
  2. Need to define data stewardship responsibilities that span the entire data lifecycle.