Definition: A column or set of columns in a table that uniquely identifies each row. Primary keys are critical for maintaining data integrity and are highlighted in Structured’s data documentation. Example: The “Customer ID” in the “Customers” table might be the primary key used to uniquely identify each customer in the database.
Glossary
This glossary provides comprehensive definitions for key technical and business-related terms used in Structured. Understanding these terms will help you navigate the platform more effectively and collaborate easily with both technical and non-technical teams. Each term is explained in the context of how it is used within Structured to manage data, track metrics, and streamline workflows.
Definition: Quantifiable measures that track performance or goals in a business, tied to specific data in Structured. Metrics can be defined and automatically updated as the underlying data changes. Example: “Customer Lifetime Value” could be a metric that calculates total revenue per customer using data from multiple tables.
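As a rough illustration of how a metric like this could be derived from several tables, the sketch below sums order amounts per customer in an in-memory SQLite database. The table and column names (`customers`, `orders`, `amount`) are hypothetical, and this is not a description of how Structured defines or stores metrics.

```python
import sqlite3

# Hypothetical tables and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);
""")

# Total revenue per customer: a simple stand-in for Customer Lifetime Value.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS lifetime_value
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

lifetime_value = dict(rows)
print(lifetime_value)
```

Because the value is computed by a query rather than stored, rerunning it after new orders arrive yields an updated figure.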
Definition: Data that provides information about other data, such as descriptions, data types, and relationships. In Structured, metadata describes tables, columns, relationships, and models. Example: Metadata for the “Orders” table includes column names, data types (e.g., “Order ID” as an integer), and relationships (e.g., a foreign key linking to the “Customers” table).
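To make the idea concrete, the following sketch reads column and foreign key metadata from an in-memory SQLite database using its built-in `PRAGMA` statements. The `customers` and `orders` tables are hypothetical, and the way Structured collects metadata from real data sources may differ.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        total REAL
    );
""")

# PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
columns = {
    row[1]: row[2]
    for row in conn.execute("PRAGMA table_info(orders)")
}

# PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
foreign_keys = [
    (row[3], row[2], row[4])  # (local column, referenced table, referenced column)
    for row in conn.execute("PRAGMA foreign_key_list(orders)")
]

print(columns)
print(foreign_keys)
```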
Definition: Descriptions and documentation of data sources, tables, columns, and models in Structured. These definitions include metadata that helps explain how data is structured and used. Example: The data definition for a “Sales” table would include each column’s purpose, data type, and any relationships to other tables.
Definition: An abstraction layer that turns raw data into meaningful business objects and metrics, enabling non-technical users to access and understand complex data structures without needing to know SQL or programming languages. Example: A user might ask the semantic layer for customer data without worrying about which specific tables or joins are needed to retrieve the information.
Definition: A system or database from which Structured pulls data. Common data sources include BigQuery, Postgres, Snowflake, Redshift, and others. Example: Connecting a Postgres database as a data source allows Structured to pull in all of the tables, models, and metadata from that database.
Definition: The process by which Structured imports data and metadata from connected data sources. Syncing ensures that Structured always has the latest data definitions and updates from your data sources. Example: After making changes to the schema in a Snowflake database, a sync in Structured updates the metadata and definitions to reflect those changes.
Definition: The flow of data from its origin through transformations to its final destination, such as reports or dashboards. Data lineage in Structured tracks how data flows between tables, business objects, and metrics. Example: Data lineage for “Sales Revenue” might show that it originates in transaction logs, flows through the “Orders” table, and is aggregated into the “Revenue” metric.
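One simple way to picture lineage is as a graph in which each node lists the nodes it reads from. The sketch below walks such a graph to find everything upstream of a metric. The node names and the `trace_lineage` helper are hypothetical and do not reflect Structured's internal representation.

```python
# Hypothetical lineage graph: each node maps to the nodes it reads from.
upstream = {
    "revenue_metric": ["orders"],
    "orders": ["transaction_logs"],
    "transaction_logs": [],
}


def trace_lineage(node, graph):
    """Return every upstream dependency of `node`, nearest first."""
    seen = []
    queue = list(graph.get(node, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(graph.get(current, []))
    return seen


lineage = trace_lineage("revenue_metric", upstream)
print(lineage)
```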
Definition: A structured representation of how data is organized, including tables, columns, and relationships. Data models define how data is stored and how its elements relate to one another in a system. Example: A “Customer” data model might define the relationships between the “Customers,” “Orders,” and “Products” tables, specifying how customers are related to orders and products.
Definition: An automation tool within Structured that helps users make data requests, manage tickets, and automate workflows directly in Slack or the Structured platform. StructuredBot can also comment on GitHub pull requests with insights about data model changes. Example: You can ask StructuredBot in Slack to pull the schema for the “Customers” table or create a ticket for a data model update.
Definition: Notifications triggered when specific conditions occur in your data environment, such as schema changes or anomalies in metrics. Alerts help users quickly identify and respond to issues. Example: An alert might be triggered if a critical column, like “Customer ID,” is accidentally deleted from the “Orders” table, affecting downstream reports.
Definition: A significant change to a data model or schema that disrupts downstream processes, reports, or metrics. Breaking changes typically require immediate attention to avoid data inconsistency. Example: Renaming a primary key column in the “Customers” table without updating related dependencies could result in a breaking change.
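A minimal sketch of how a breaking change like this could be detected is to compare two schema snapshots and report columns that no longer exist. The snapshot format and the `find_removed_columns` helper are assumptions made for illustration, not Structured's detection logic.

```python
def find_removed_columns(old_schema, new_schema):
    """Return {table: removed columns} for columns missing from new_schema."""
    removed = {}
    for table, old_columns in old_schema.items():
        new_columns = set(new_schema.get(table, []))
        missing = sorted(set(old_columns) - new_columns)
        if missing:
            removed[table] = missing
    return removed


# Hypothetical snapshots taken before and after a schema change.
before = {"customers": ["customer_id", "name", "email"]}
after = {"customers": ["cust_id", "name", "email"]}

breaking = find_removed_columns(before, after)
print(breaking)
```

Note that a rename appears here as a removal, because anything downstream that still refers to the old column name will fail either way.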
Definition: Tracked work items for tasks related to data changes, such as schema updates or fixes for metric discrepancies. Users create and manage tickets in Structured so that requests are tracked and addressed systematically. Example: If a new column needs to be added to a table, a ticket can be generated in Structured to ensure the request is properly reviewed and implemented.
Definition: The practice of tracking and managing changes to data models, definitions, and metrics. In Structured, version control is often managed through integrations with systems like GitHub, where pull requests track changes. Example: Version control allows a team to revert to a previous version of a data model if a recent change causes problems downstream.
Definition: A popular data transformation tool that enables data engineers to define and manage data models through code. Structured integrates with dbt to automatically document models and track changes over time. Example: A team using dbt might define a transformation model for “Sales Revenue,” which Structured then documents and tracks in the platform.
Definition: Short for extract, transform, load: the process of extracting data from source systems, transforming it into a usable format, and loading it into a destination system (such as a data warehouse). While Structured focuses on documentation and metadata management, it often works with data that has undergone ETL processes. Example: Data is extracted from an e-commerce platform, transformed into a usable format for analysis, and then loaded into a Snowflake data warehouse where Structured syncs it.
Definition: A series of data processing steps that move data from one system to another, often through transformation and cleaning stages. Structured integrates with data pipelines to document and track the flow of data through your system. Example: A pipeline might move raw transaction data from an application into a reporting system, where Structured tracks and documents each step along the way.
Definition: The structure of a database, including tables, columns, and their relationships. In Structured, schemas are documented to provide clarity on how data is organized. Example: The schema for a “Sales” database might include tables for “Orders,” “Customers,” and “Products,” each with defined columns like “Order ID” and “Product Name.”
Definition: A column or group of columns in a database that establishes a link between data in two tables. In Structured, foreign keys are documented in the data lineage to show relationships between tables. Example: In the “Orders” table, the “Customer ID” might be a foreign key linking to the “Customers” table.
Definition: A centralized repository used to store large amounts of structured data for analysis and reporting. Common data warehouses include BigQuery, Snowflake, and Redshift, which are supported data sources in Structured. Example: A company might use Snowflake as its data warehouse, where all transactional and customer data is stored and documented in Structured.
Definition: The process of converting raw data into a more usable format for analysis, reporting, or integration. Data transformations are often managed using tools like dbt and documented in Structured. Example: Transforming raw sales data by calculating total revenue and generating metrics like Average Order Value (AOV).
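The transformation in this example can be sketched in a few lines. The snippet assumes one record per order and a hypothetical `amount` field. In practice this logic would more likely live in a tool such as dbt.

```python
# Hypothetical raw sales records, one per order.
raw_sales = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": 80.0},
    {"order_id": 3, "amount": 100.0},
]

# Transform: aggregate raw records into total revenue and Average Order Value.
total_revenue = sum(sale["amount"] for sale in raw_sales)
order_count = len(raw_sales)
average_order_value = total_revenue / order_count if order_count else 0.0

print(total_revenue, average_order_value)
```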
Definition: A property or characteristic of an entity in a data model. In databases, attributes correspond to columns in tables. Example: For the “Customer” entity, attributes could include “Customer ID,” “Customer Name,” and “Email Address.”
Definition: A connection between two or more entities in a data model that describes how data is related across tables. Relationships often involve foreign keys that link rows in one table to rows in another. Example: The “Customer” entity might have a relationship with the “Order” entity, indicating that customers place orders.
Definition: A high-level data model that defines the structure of the data in terms of business concepts and entities, without getting into the technical implementation details. It focuses on what data is represented rather than how it is stored. Example: A conceptual model might show that a “Customer” entity has a relationship with an “Order” entity, without specifying the database schema.
Definition: A more detailed data model that defines the structure of data, including entities, attributes, and relationships, but still remains independent of any physical database technology. Example: A logical model might define the specific attributes of “Customer” and “Order” entities, along with their data types and relationships, but without detailing how the data is stored in a particular database system.
Definition: A process in database design aimed at reducing redundancy and undesirable dependencies by organizing data into separate, related tables. Normalization is used to ensure that each fact is stored once, so data stays efficient to store and consistent to update. Example: In a normalized database, customer information would be stored in a separate “Customers” table, rather than being repeated in every “Orders” record.
Definition: The process of combining tables in a database to improve read performance at the expense of write performance and storage efficiency. Denormalization is used when faster querying of data is prioritized over reducing redundancy. Example: In a denormalized database, customer information might be stored directly in the “Orders” table to avoid having to join tables during queries.
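The trade-off can be seen in a small sketch that starts from a normalized layout (customers stored once, orders referring to them by ID) and produces a denormalized one. All names and the `denormalize` helper are hypothetical.

```python
# Normalized layout: customer details are stored once and referenced by ID.
customers = {1: {"name": "Ada", "email": "ada@example.com"}}
orders = [
    {"order_id": 10, "customer_id": 1, "total": 50.0},
    {"order_id": 11, "customer_id": 1, "total": 25.0},
]


def denormalize(orders, customers):
    """Copy customer attributes into every order so reads need no join."""
    result = []
    for order in orders:
        customer = customers[order["customer_id"]]
        flat = dict(order)
        for key, value in customer.items():
            flat[f"customer_{key}"] = value
        result.append(flat)
    return result


denormalized_orders = denormalize(orders, customers)
print(denormalized_orders[0])
```

Each denormalized record can be read without a lookup, but the customer's email now appears in every order and has to be updated in all of them if it changes.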
Definition: The accuracy, consistency, and reliability of data throughout its lifecycle. In data dictionaries, integrity rules are often included to define acceptable values and relationships between data elements. Example: Data integrity rules might specify that the “Customer ID” in the “Orders” table must always match an existing record in the “Customers” table.
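A rule like the one in this example is commonly enforced with a foreign key constraint. The sketch below uses an in-memory SQLite database with hypothetical `customers` and `orders` tables. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1)")  # customer 1 exists: accepted

try:
    conn.execute("INSERT INTO orders VALUES (11, 99)")  # no customer 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rejected, order_count)
```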
Definition: A comprehensive inventory of available data assets, including metadata, definitions, and lineage. A data catalog helps users find and understand the data they need for analysis or reporting. Example: A data catalog might list all the available tables in a database, including their definitions, descriptions, and the relationships between them.
Definition: Short for online analytical processing, a category of data processing that enables users to query and analyze data in a multidimensional format, often used in data warehouses for complex queries and reporting. Example: OLAP allows users to quickly analyze sales data by region, time period, and product category.
Definition: The processes and policies that ensure the proper management, quality, and security of an organization’s data. Data governance helps maintain data integrity and compliance with regulatory requirements. Example: A data governance policy might define who can access sensitive customer data and how that data should be encrypted.
Definition: A hierarchical classification system used to categorize data elements based on their relationships and characteristics. Taxonomies help organize data for easier discovery and use. Example: A company might develop a taxonomy that classifies products by category, subcategory, and type in their product catalog.
Definition: The central table in a star schema that stores quantitative data (measures) for analysis. Fact tables typically contain foreign keys to dimension tables. Example: A “Sales” fact table might store sales transaction data, with foreign keys linking to “Product” and “Customer” dimension tables.
Definition: A table in a data warehouse schema that stores descriptive attributes related to the facts in the fact table. Dimension tables provide context for analyzing quantitative data. Example: A “Product” dimension table might store details like product name, category, and brand, which can be used to analyze sales data in the fact table.
Definition: A database schema design that organizes data into a central fact table and several dimension tables, which resemble a star when diagrammed. This schema is common in data warehouses and supports efficient querying. Example: In a retail data warehouse, the star schema might consist of a central “Sales” fact table with dimension tables for “Customer,” “Product,” and “Store.”
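A minimal sketch of a star schema, assuming hypothetical `fact_sales`, `dim_product`, and `dim_store` tables in an in-memory SQLite database, shows how the fact table is joined to its dimensions to slice a measure by descriptive attributes.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO dim_store VALUES (1, 'North'), (2, 'South');
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 10.0), (2, 1, 2, 15.0), (3, 2, 1, 30.0), (4, 2, 1, 20.0);
""")

# Join the fact table to its dimensions and aggregate the measure.
sales_by_slice = conn.execute("""
    SELECT s.region, p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.region, p.category
    ORDER BY s.region, p.category
""").fetchall()

print(sales_by_slice)
```

The fact table holds only keys and the numeric measure, while the dimension tables supply the attributes used for grouping.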