Breaking Down Big Data: The Value in Metadata

By John Horodyski

“I never met a data that I didn’t like.” -- Internet Meme

As a Partner at Optimity Advisors, my role is to work with clients to make data likeable: identifiable, discoverable, usable and ultimately, valuable. Companies are struggling to manage big data in a landscape of rapidly increasing production and diverse formats. The ability to collect and analyze internal and external data can dictate how well an organization will generate knowledge, and ultimately value. How can you start planning for this value?

Gary Drenik argues in his recent Forbes article,

“How to beat the big data giant? Start by thinking little data, as in David vs. Goliath. The first step in the little data process is to identify key business objectives that your organization would like to have data solve. Objectives need to be clearly defined … once defined, objectives serve as a roadmap for the identification of relevant data sourcing to generate new insights for evidence-based decision making by executive team members.”

Metadata is the best way to identify little data that becomes big data. Little data provides structure to what becomes big data. Invest the time, energy and resources to identify, define and organize your assets for discovery and increase their value.

What’s the Big Deal About Big Data?

Big data refers to data sets that are too large and complex to manipulate or interrogate with standard methods or tools. Information -- and all its data and digital assets -- has become more available, accessible and in some ways, more accountable in business. To better understand the big deal about big data, start with an understanding of the associated terms, which include:

  • Metadata, simply stated, is information that describes other data; essentially, data about data. It is the descriptive, administrative and structural data that defines assets.
  • Taxonomy is the classification of information into groups or classes that share similar characteristics. It provides the consistency and control in language that can power the single source of truth as expressed in a DAM or CMS and is a key enabler for organizing any large body of content.
  • A controlled vocabulary is a set of defined terms that populate a drop-down or pick list. Establishing “preferred terms” is a good way to provide control, authority and consistency to your digital assets. You need to know not only what you are describing but also how it may best be described.
  • Structured data refers to information with a good level of control and organization, for example, a “date” value in an “Expiration Date” field. Structured data is usually found in a controlled data environment with inherent meaning and purpose.
  • Unstructured data lacks that control and meaning, offers a confused sense of purpose and requires analysis or interpretation to restore meaning. Using the example above, if a “date” is discovered with no “field” in which to provide that control and structure, what does that tell you? Wrangling all that data will create a more structured sense of purpose for the content in your organization. It makes information more relevant, palpable, understandable and usable.
  • Master Data is business-critical data that is governed and shared across multiple systems, applications or departments within an organization. Master Data can be identifiers, attributes, relationships, reference data and yes, metadata!
  • Master Data Management (MDM) is the set of processes, tools and governance standards/policies that consistently define, manage and distribute Master Data. Everything starts with data modeling, and data modeling is inherently tied to metadata (ISO/IEC 11179).
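
To make a few of these terms concrete, here is a minimal Python sketch of a single metadata record. Everything in it is an illustrative assumption rather than a prescribed schema: the `AssetMetadata` class, its field names and the `ASSET_TYPES` pick list are hypothetical. It shows a controlled vocabulary rejecting a non-preferred term and a date gaining meaning because it sits in a named "Expiration Date" field.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical controlled vocabulary: the "preferred terms" of a pick list.
ASSET_TYPES = {"image", "video", "document"}


@dataclass
class AssetMetadata:
    """Illustrative metadata record; field names are assumptions."""
    title: str             # descriptive metadata
    asset_type: str        # descriptive, constrained by the controlled vocabulary
    owner: str             # administrative metadata
    part_of: str           # structural metadata: how the asset relates to others
    expiration_date: date  # structured data: a date value in a named field

    def __post_init__(self):
        # Enforce the controlled vocabulary when the record is created.
        if self.asset_type not in ASSET_TYPES:
            raise ValueError(
                f"{self.asset_type!r} is not a preferred term; "
                f"use one of {sorted(ASSET_TYPES)}"
            )


record = AssetMetadata(
    title="Spring campaign hero shot",
    asset_type="image",
    owner="Marketing",
    part_of="Spring campaign",
    expiration_date=date(2025, 12, 31),
)
```

A bare value such as `date(2025, 12, 31)` with no field around it would be the unstructured case described above: the value exists, but nothing says whether it is an expiration date, a creation date or something else.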

The value that metadata, or little data, brings to big data is in the structure and meaning it provides. It serves asset discovery by identifying assets and allowing them to be found by relevant criteria. Metadata also brings similar assets together and distinguishes dissimilar assets. Value is added by managing data.
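
As a rough illustration of how metadata serves discovery, the Python sketch below filters a small catalog by relevant criteria and groups similar assets together. The catalog contents and the helper names `find_assets` and `group_by` are hypothetical, chosen only for this example.

```python
# Hypothetical catalog: each asset is described by a few metadata fields.
assets = [
    {"id": "A1", "type": "image", "brand": "Acme", "region": "EU"},
    {"id": "A2", "type": "video", "brand": "Acme", "region": "US"},
    {"id": "A3", "type": "image", "brand": "Globex", "region": "EU"},
]


def find_assets(catalog, **criteria):
    """Return the assets whose metadata matches every given criterion."""
    return [
        asset for asset in catalog
        if all(asset.get(field) == value for field, value in criteria.items())
    ]


def group_by(catalog, field):
    """Bring similar assets together by a shared metadata value."""
    groups = {}
    for asset in catalog:
        groups.setdefault(asset.get(field), []).append(asset["id"])
    return groups


eu_images = find_assets(assets, type="image", region="EU")
by_brand = group_by(assets, "brand")
```

Without the `type`, `brand` and `region` fields, the three assets would be indistinguishable files; with them, the same assets can be found, grouped and told apart.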

Big Data Challenges - 'I Still Haven’t Found What I’m Looking For'

We have an unprecedented wealth of data at our disposal, and it is under considerable watch and scrutiny from creators, users and stakeholders. Organizations need to change accordingly to respond and create new solutions. The challenge with big data is how to manage it. That includes everything from:

  • identification
  • capture
  • curation
  • storage
  • search
  • sharing
  • analysis

Big data will only continue to grow. The emergence of the Internet of Things and new platforms will produce more information and data in locations both within and external to your business. A better understanding of data and repositories will help protect the organization from risks such as:

  • Savvy plaintiffs requesting big data in discovery
  • Inadvertent data ingestion that results in data breaches
  • Consumers losing confidence in data protection upon realizing that personal data is everywhere

There has never been a more important time to make data a priority in your strategic planning.

Best Practices

For managing metadata and digital assets in business:

  • Metadata management and planning for new processes or systems
  • Clear ownership of digital assets and identification of absent documentation
  • Current documentation on metadata or controlled vocabulary

Asking the right questions about best practices will establish whether these practices are in place. While enterprise solution providers have not delivered on many tasks that could be automated, new platforms provide great opportunities for communication/engagement/risk management. Additionally, social media and a variety of other social collaboration tools will affect the workplace, blurring the boundaries of how and when business is conducted. Data sharing and collaboration will play an important part in this growth.

The ability to collect and analyze internal and external data can dictate how well an organization will generate knowledge and ultimately value. How can you start planning for this value? A few things to start working towards include:

1. Data Assessment & Organization Planning

  • Inventory and discover data life cycles and users
  • Create a well-planned data warehouse model to ensure valuable enterprise-wide information and metrics, as well as good performance and provisions for growth

2. Capability Assessment & Gap Analysis

  • Produce maps linking data to business processes and validate
  • Determine areas where reduction of redundant, obsolete or transient data may be happening. You need to ensure that data in transition is handled accurately and implemented quickly to meet the speed of your business.

3. Modeling & Analysis

  • Develop plans to facilitate analysis and future action for an operational system, or to create visualizations and reports for your teams and interest groups, formatted as they need them and delivered when they need them.

Data must be delivered consistently, with standard definitions, and provide the ability to reconcile data models from various systems or data marts.
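
One way to picture reconciling data models from various systems is a mapping from each system's field names onto a shared set of standard definitions. The Python sketch below assumes two hypothetical source systems, labeled "dam" and "cms", with invented field names; it is not drawn from any particular product.

```python
# Hypothetical mappings from each system's field names to standard definitions.
FIELD_MAPS = {
    "dam": {"asset_title": "title", "expires": "expiration_date"},
    "cms": {"name": "title", "expiry_dt": "expiration_date"},
}


def to_standard(system, record):
    """Rename a record's fields to the standard definitions for its system."""
    mapping = FIELD_MAPS[system]
    unknown = set(record) - set(mapping)
    if unknown:
        # Surface fields that have no agreed definition instead of dropping them.
        raise KeyError(
            f"No standard definition for {sorted(unknown)} from {system!r}"
        )
    return {mapping[field]: value for field, value in record.items()}


from_dam = to_standard("dam", {"asset_title": "Logo", "expires": "2025-12-31"})
from_cms = to_standard("cms", {"name": "Logo", "expiry_dt": "2025-12-31"})
```

Once both records use the same definitions they can be compared directly, and a field with no agreed definition is flagged rather than silently lost.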

Making Value 

The struggle to manage information within the big data landscape is as complex as the digital workflows it supports. This landscape includes the internal ecosystem and the wider geography of partners and third-party entities. The complexity of all of the available data is compounded by the increasing rate of production and diversity of formats.

Assets are critical to your business operations -- they need to be discovered at all points of the digital lifecycle. Key to building trust in your data is ensuring its accuracy and usability. Leveraging meaningful metadata provides your best chance for a return on investment on the assets created and becomes an essential line of defense against lost opportunities. Your users' digital experience is based on their ability to identify, discover and experience your brand in the way it was intended. Value is not found -- it's made -- so make the data meaningful to you, your users and your organization by managing it well.

About the Author

John Horodyski

John Horodyski is an Executive Director with Salt Flats for the Insights & Analytics practice with executive management strategy experience in Digital Asset Management (DAM), Metadata and Taxonomy design, Data strategy, Analytics, Governance, MarTech, and Marketing Operations. John is a world-leading expert and has provided strategic direction and consulting for a variety of Fortune 10, 50, 100, and 500 clients from Consumer Packaged Goods, to Media & Entertainment, the Pharmaceutical industry, and Insurance.