Reference Guide

AI Glossary & Resource Hub

A comprehensive reference for Australian business leaders navigating AI, data platforms, and regulatory requirements. Clear definitions, local context, no unnecessary jargon.

47+ terms covering core AI and ML concepts, large language models, data architecture, Australian regulations, and enterprise strategy.

Agentic Workflows: A pattern in which AI systems autonomously plan, execute, and adapt multi-step tasks with minimal human intervention. Rather than responding to a single prompt, an AI agent breaks a goal into sub-tasks, calls external tools or APIs, evaluates intermediate results, and iterates until the objective is met.
Australian Context
Australian financial services and government agencies are exploring agentic workflows for compliance monitoring, claims processing, and citizen service automation, with appropriate human-in-the-loop checkpoints to satisfy regulatory expectations.
Related: AI Agents, Large Language Model, Prompt Engineering
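The plan, act, evaluate, and iterate loop described under Agentic Workflows can be sketched in a few lines of Python. This is a minimal illustration only: the `plan` function, the tool names, and the escalation rule are hypothetical stand-ins for what a language model and real APIs would do.

```python
from typing import Callable

# Hypothetical tools; a real agent would call external systems or APIs here.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_policy": lambda claim_id: f"policy found for {claim_id}",
    "check_fraud_flags": lambda claim_id: f"no fraud flags on {claim_id}",
}

def plan(goal: str) -> list[str]:
    """Stand-in for an LLM planner: break the goal into an ordered list of tool calls."""
    return ["lookup_policy", "check_fraud_flags"]

def run_agent(goal: str, claim_id: str, max_steps: int = 5) -> list[str]:
    """Run each planned sub-task, record the result, and stop early on problems."""
    results: list[str] = []
    for step, tool_name in enumerate(plan(goal)):
        if step >= max_steps:  # guard against runaway loops
            break
        outcome = TOOLS[tool_name](claim_id)
        results.append(outcome)
        # Human-in-the-loop checkpoint: escalate anything unexpected.
        if "error" in outcome:
            results.append("escalated to human reviewer")
            break
    return results

agent_log = run_agent("triage claim", "CLM-001")
```

The `max_steps` guard and the escalation branch are where the human-in-the-loop checkpoints mentioned above would sit in a production design.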
AI Agents: Software systems that use large language models or other AI techniques to perceive their environment, make decisions, and take actions toward a defined goal. Unlike simple chatbots, agents can use tools, access databases, call APIs, and chain multiple reasoning steps together autonomously.
Australian Context
Enterprises across banking, insurance, and government in Australia are piloting AI agents for tasks such as customer onboarding, fraud investigation triage, and internal knowledge retrieval, balancing autonomy with governance requirements.
Related: Agentic Workflows, Large Language Model, RAG (Retrieval-Augmented Generation)
AI Ethics Framework (Australia): Australia's voluntary AI Ethics Framework, published by the Department of Industry, Science and Resources, outlines eight principles for responsible AI: human, societal and environmental wellbeing; human-centred values; fairness; privacy protection and security; reliability and safety; transparency and explainability; contestability; and accountability.
Australian Context
While currently voluntary, the framework signals the direction of future regulation. Organisations that embed these principles early reduce their risk of non-compliance as mandatory obligations emerge. The framework is widely referenced in government procurement and regulated industries.
Related: Responsible AI, Privacy Act 1988, APRA CPS 234
AI Readiness: A measure of how prepared an organisation is to successfully adopt and scale artificial intelligence. AI readiness spans data quality and accessibility, technology infrastructure, talent and skills, governance frameworks, leadership commitment, and organisational culture.
Australian Context
Many Australian enterprises score well on ambition but lag on data foundations and governance. A structured AI readiness assessment helps identify the specific gaps that need closing before AI initiatives can deliver production value.
Related: AI Strategy, Data Governance, Digital Transformation
AI Strategy: A structured plan that defines how an organisation will adopt AI to achieve its business objectives. A sound AI strategy covers use case prioritisation, data foundations, technology choices, talent development, governance, and a phased roadmap with measurable outcomes.
Australian Context
For Australian enterprises, an effective AI strategy must also account for local regulatory requirements, data sovereignty considerations, and the relatively tight talent market. Starting with high-impact, low-risk use cases builds organisational confidence.
Related: AI Readiness, Digital Transformation, ROI (AI Context)
APRA CPS 234: Prudential Standard CPS 234 (Information Security) is issued by the Australian Prudential Regulation Authority. It requires APRA-regulated entities (banks, insurers, superannuation funds) to maintain information security capabilities commensurate with the threats they face, including those arising from AI and data platforms.
Australian Context
CPS 234 makes the board ultimately responsible for information security, requires material information security incidents to be reported to APRA within 72 hours of the entity becoming aware of them, and requires entities to assess the information security capability of third parties (including cloud and AI vendors) that manage their information assets. Any AI or data platform deployment in an APRA-regulated entity must be able to demonstrate CPS 234 compliance.
Related: Privacy Act 1988, IRAP, Data Governance
Artificial Intelligence: The broad field of computer science concerned with building systems capable of performing tasks that typically require human intelligence. This includes learning from data, recognising patterns, understanding language, making decisions, and generating content. Modern AI spans narrow applications (e.g. fraud detection) through to general-purpose systems such as large language models.
Related: Machine Learning, Deep Learning, Natural Language Processing
Australian Privacy Principles: The thirteen Australian Privacy Principles (APPs) form the cornerstone of the Privacy Act 1988. They govern how organisations collect, use, store, disclose, and provide access to personal information. The APPs apply to any AI system that processes personal data.
Australian Context
AI initiatives must comply with the APPs covering collection and notification (APPs 3 and 5), use and disclosure limitations (APP 6), data quality (APP 10), and security (APP 11). Organisations building AI models on customer data need to ensure their data pipelines, training processes, and inference systems all satisfy APP obligations.
Related: Privacy Act 1988, Notifiable Data Breaches Scheme, Data Governance
AutoML: Automated Machine Learning (AutoML) refers to tools and techniques that automate the process of building machine learning models. This includes automated feature engineering, model selection, hyperparameter tuning, and evaluation. AutoML lowers the barrier to entry for ML by reducing the need for deep data science expertise.
Australian Context
AutoML platforms such as those available within Databricks enable Australian enterprises to accelerate their AI programs even with limited data science headcount, a common constraint in the local market.
Related: Machine Learning, MLOps, Feature Store

Computer Vision: A branch of AI that enables machines to interpret and understand visual information from images and video. Applications include object detection, image classification, facial recognition, optical character recognition (OCR), and quality inspection in manufacturing.
Australian Context
Australian enterprises use computer vision in mining (autonomous vehicles, safety monitoring), agriculture (crop health assessment), retail (inventory management), and healthcare (medical imaging analysis). Privacy Act obligations apply wherever facial recognition or personal imagery is involved.
Related: Deep Learning, Neural Network, Artificial Intelligence

Data Governance: The set of policies, processes, standards, and metrics that ensure data is managed as a strategic asset. Data governance covers data quality, access control, privacy, lineage, classification, and lifecycle management. Without governance, AI projects cannot move safely from prototype to production.
Australian Context
Australian regulators increasingly expect demonstrable data governance. APRA-regulated entities must show how data used in models and AI systems is controlled. The Privacy Act requires governance over personal information. Unity Catalog on Databricks provides a technical foundation for enterprise data governance.
Related: Data Pipeline, Unity Catalog, Australian Privacy Principles
Data Lake: A centralised repository that stores large volumes of raw data in its native format, whether structured, semi-structured, or unstructured. Data lakes offer low-cost, scalable storage and are well suited for exploratory analytics and machine learning workloads.
Australian Context
Many Australian enterprises adopted data lakes in the 2010s but found they became ungoverned 'data swamps' without proper metadata management and access controls. The lakehouse architecture addresses these shortcomings.
Related: Data Lakehouse, Data Warehouse, Data Governance
Data Lakehouse: A modern data architecture that combines the low-cost, flexible storage of a data lake with the performance, reliability, and governance features of a data warehouse. The lakehouse supports both business intelligence and AI/ML workloads on a single platform, eliminating the need for separate systems.
Australian Context
Databricks pioneered the lakehouse pattern using Delta Lake. For Australian enterprises, the lakehouse simplifies compliance by providing a single governed platform rather than multiple disconnected systems, each with its own security and audit configuration.
Related: Data Lake, Data Warehouse, Databricks, Delta Lake
Data Mesh: An organisational and architectural approach that treats data as a product, owned and managed by domain teams rather than a centralised data team. Data mesh principles include domain ownership, data as a product, self-serve data infrastructure, and federated computational governance.
Australian Context
Large Australian banks and insurers are adopting data mesh principles to scale their data capabilities across business units while maintaining consistent governance. Databricks and Unity Catalog support data mesh by enabling domain-level data products with centralised governance policies.
Related: Data Governance, Data Lakehouse, Unity Catalog
Data Pipeline: An automated series of processes that move and transform data from source systems to a destination such as a data lake, warehouse, or lakehouse. Pipelines handle extraction, validation, transformation, enrichment, and loading of data, ensuring it arrives in the right format for analytics and AI.
Related: ETL/ELT, Data Lakehouse, Data Governance
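The stages named in the Data Pipeline entry (extraction, validation, transformation, loading) can be sketched as a chain of small functions. This is an illustrative outline only: the sample rows, field names, and the in-memory `warehouse` list are hypothetical, and a real pipeline would read from source systems and write to a governed destination.

```python
def extract() -> list[dict]:
    """Hypothetical source rows; a real pipeline would query a source system."""
    return [
        {"customer_id": "1", "amount": "120.50"},
        {"customer_id": "2", "amount": "not-a-number"},
        {"customer_id": "3", "amount": "75.00"},
    ]

def validate(rows: list[dict]) -> list[dict]:
    """Keep only rows whose amount can be parsed as a number."""
    valid = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue  # a real pipeline would log or quarantine the bad row
        valid.append(row)
    return valid

def transform(rows: list[dict]) -> list[dict]:
    """Convert text fields into typed, analytics-ready values."""
    return [
        {"customer_id": int(r["customer_id"]), "amount_aud": float(r["amount"])}
        for r in rows
    ]

def load(rows: list[dict], destination: list[dict]) -> int:
    """Append rows to the destination and report how many were loaded."""
    destination.extend(rows)
    return len(rows)

warehouse: list[dict] = []
loaded = load(transform(validate(extract())), warehouse)
```

Keeping each stage as a separate function makes it easier to test, monitor, and audit the steps individually.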
Data Warehouse: A centralised repository of structured, cleaned, and organised data optimised for fast analytical queries. Data warehouses use schema-on-write, meaning data must be structured before it is stored. They excel at business intelligence but are typically expensive to scale and poorly suited to unstructured data or ML workloads.
Related: Data Lake, Data Lakehouse, ETL/ELT
Databricks: A unified data and AI platform built on Apache Spark that provides a lakehouse architecture for analytics, data engineering, data science, and machine learning. Databricks combines a collaborative workspace, managed infrastructure, and governance tools (Unity Catalog) to help enterprises consolidate their data stack and accelerate AI adoption.
Australian Context
Databricks operates on all three major cloud providers in Australian regions (AWS Sydney, Azure Australia East, GCP Sydney), supporting data sovereignty requirements. It is widely adopted by Australian financial services, government, healthcare, and retail organisations.
Related: Data Lakehouse, Unity Catalog, MLOps
Deep Learning: A subset of machine learning that uses multi-layered neural networks to learn complex patterns from large volumes of data. Deep learning powers many modern AI breakthroughs including image recognition, speech understanding, natural language processing, and generative AI.
Related: Neural Network, Machine Learning, Transformer
Digital Transformation: The process of fundamentally changing how an organisation operates and delivers value by adopting digital technologies. In the AI era, digital transformation increasingly centres on data modernisation, automation, and embedding intelligence into core business processes.
Australian Context
Australian enterprises across financial services, government, healthcare, and resources are in various stages of digital transformation. Those that have modernised their data foundations are best positioned to capture value from AI.
Related: AI Strategy, AI Readiness, Data Lakehouse

Embeddings: Numerical vector representations of data (text, images, or other content) that capture semantic meaning in a format AI models can process. Similar items produce similar embeddings, enabling tasks such as semantic search, recommendations, and clustering without explicit rules.
Related: Vector Database, Large Language Model, RAG (Retrieval-Augmented Generation)
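To make "similar items produce similar embeddings" concrete, the sketch below compares toy vectors with cosine similarity, a common way of measuring how closely two embeddings point in the same direction. The three-dimensional vectors are invented for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings, invented for illustration only.
embeddings = {
    "home loan": [0.9, 0.1, 0.0],
    "mortgage": [0.8, 0.2, 0.1],
    "crop yield": [0.0, 0.1, 0.9],
}

sim_related = cosine_similarity(embeddings["home loan"], embeddings["mortgage"])
sim_unrelated = cosine_similarity(embeddings["home loan"], embeddings["crop yield"])
```

With these toy values, the related pair scores close to 1 and the unrelated pair scores close to 0, which is the property semantic search relies on.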
ETL/ELT: ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) are two approaches to moving data into analytical systems. ETL transforms data before loading it into the destination; ELT loads raw data first and transforms it in place. ELT has become the dominant pattern in cloud and lakehouse architectures because it preserves raw data and leverages the processing power of the destination platform.
Related: Data Pipeline, Data Lakehouse, Data Warehouse
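The difference between ETL and ELT is mainly where the transformation runs. The sketch below uses Python's built-in `sqlite3` module as a stand-in destination; the table names and sample rows are hypothetical, and a lakehouse would use its own engine in place of SQLite.

```python
import sqlite3

# Hypothetical raw source rows: text values with stray whitespace.
raw_rows = [("1", " 120.50 "), ("2", "75")]

conn = sqlite3.connect(":memory:")

# ETL: transform in the pipeline, then load only the cleaned result.
conn.execute("CREATE TABLE sales_etl (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales_etl VALUES (?, ?)",
    [(int(cid), float(amt.strip())) for cid, amt in raw_rows],
)

# ELT: load the raw values first, then transform inside the destination with SQL.
conn.execute("CREATE TABLE sales_raw (customer_id TEXT, amount TEXT)")
conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", raw_rows)
conn.execute(
    "CREATE TABLE sales_elt AS "
    "SELECT CAST(customer_id AS INTEGER) AS customer_id, "
    "CAST(TRIM(amount) AS REAL) AS amount FROM sales_raw"
)

etl_rows = conn.execute("SELECT * FROM sales_etl ORDER BY customer_id").fetchall()
elt_rows = conn.execute("SELECT * FROM sales_elt ORDER BY customer_id").fetchall()
raw_kept = conn.execute("SELECT COUNT(*) FROM sales_raw").fetchone()[0]
```

Both routes produce the same cleaned table, but only the ELT route leaves the untouched `sales_raw` table behind for reprocessing or audit.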

Feature Store: A centralised repository for storing, managing, and serving the engineered features (input variables) used by machine learning models. Feature stores ensure consistency between training and production environments, reduce duplicated effort, and improve model reliability.
Australian Context
Databricks provides a built-in Feature Store that integrates with Unity Catalog for governance. Australian enterprises use feature stores to ensure that the same feature definitions are used across development and production, reducing model drift and compliance risk.
Related: MLOps, Machine Learning, Data Pipeline
Fine-tuning: The process of taking a pre-trained AI model and further training it on a smaller, domain-specific dataset to improve its performance on particular tasks. Fine-tuning allows organisations to adapt general-purpose models to their industry terminology, internal processes, and specific use cases without training a model from scratch.
Australian Context
Australian enterprises fine-tune models for tasks like understanding local regulatory language, processing Australian English, and adapting to industry-specific terminology in sectors such as mining, agriculture, and financial services.
Related: Large Language Model, Transfer Learning, RAG (Retrieval-Augmented Generation)

Hallucination: A phenomenon where an AI model generates output that sounds plausible but is factually incorrect, fabricated, or inconsistent with its source material. Hallucinations are a well-known limitation of large language models and represent a significant risk in enterprise applications where accuracy is critical.
Australian Context
For Australian organisations in regulated industries, hallucinations in AI-generated advice, reports, or customer communications can create compliance and reputational risk. Mitigation strategies include RAG architectures, human-in-the-loop review, and rigorous evaluation frameworks.
Related: Large Language Model, RAG (Retrieval-Augmented Generation), Responsible AI

IRAP: The Information Security Registered Assessors Program (IRAP) is an Australian Signals Directorate (ASD) initiative that provides a framework for assessing the security of systems against the Australian Government Information Security Manual (ISM). IRAP assessments are required for cloud services used by Australian government agencies.
Australian Context
Cloud platforms hosting AI workloads for government must hold current IRAP assessments. AWS, Azure, and GCP all maintain IRAP-assessed environments in Australian regions. Databricks deployments on these clouds inherit the underlying IRAP posture, but organisations must still ensure their own configurations meet ISM controls.
Related: APRA CPS 234, Data Governance, Privacy Act 1988

Large Language Model: A type of AI model trained on vast quantities of text data that can understand, generate, summarise, and reason about natural language. LLMs such as GPT-4, Claude, and Llama power applications ranging from conversational assistants and content generation through to code writing and complex analysis.
Australian Context
Australian enterprises are deploying LLMs for customer service automation, document processing, regulatory compliance assistance, and internal knowledge management. Data sovereignty, privacy, and responsible use are key considerations when selecting and deploying LLMs.
Related: Transformer, Fine-tuning, RAG (Retrieval-Augmented Generation), Hallucination

Machine Learning: A subset of artificial intelligence in which systems learn patterns and make predictions or decisions from data without being explicitly programmed for each scenario. Machine learning encompasses supervised learning (from labelled examples), unsupervised learning (finding hidden patterns), and reinforcement learning (learning through feedback).
Related: Artificial Intelligence, Deep Learning, Neural Network
Minimum Viable Product (AI): The simplest version of an AI solution that can be deployed to validate a use case and deliver initial business value. An AI MVP focuses on a narrow, well-defined problem, uses available data, and prioritises learning and iteration over feature completeness.
Australian Context
Australian organisations that take an MVP approach to AI tend to reach value sooner and reduce the risk of failed projects. A typical AI MVP might take 6 to 12 weeks, with subsequent iterations expanding scope based on real-world performance and user feedback.
Related: Proof of Concept, AI Strategy, ROI (AI Context)
MLOps: Machine Learning Operations (MLOps) is the discipline of deploying, monitoring, and managing machine learning models in production. MLOps borrows principles from DevOps and applies them to the ML lifecycle, covering experiment tracking, model versioning, automated testing, deployment pipelines, and performance monitoring.
Australian Context
Databricks provides MLOps capabilities through MLflow, including experiment tracking, a model registry, and model serving. For Australian regulated industries, MLOps practices provide the audit trails and reproducibility required by supervisors such as APRA.
Related: ModelOps, Machine Learning, Feature Store
ModelOps: A broader discipline than MLOps that encompasses the operationalisation of all types of AI and analytical models, including ML models, rules-based models, and decision models. ModelOps focuses on governance, lifecycle management, and performance monitoring across the full model portfolio.
Related: MLOps, AI Agents, Data Governance
My Health Records Act: The My Health Records Act 2012 governs the Australian digital health record system and sets strict rules around the collection, use, and disclosure of health information. Any AI system that accesses or processes My Health Record data must comply with the Act's specific privacy and security requirements.
Australian Context
Healthcare organisations building AI solutions in Australia must understand the interplay between the My Health Records Act, the Privacy Act, and state-level health records legislation. Penalties for misuse of My Health Record data are severe and include criminal sanctions.
Related: Privacy Act 1988, Australian Privacy Principles, Data Governance

Natural Language Processing: A branch of AI focused on enabling computers to understand, interpret, and generate human language. NLP underpins applications such as chatbots, sentiment analysis, document summarisation, translation, and search. Modern NLP is largely powered by transformer-based large language models.
Related: Large Language Model, Transformer, Artificial Intelligence
Neural Network: A computing architecture inspired by the structure of the human brain, consisting of layers of interconnected nodes (neurons) that process information. Neural networks learn by adjusting the strength of connections between nodes based on training data. They are the foundation of deep learning and modern AI.
Related: Deep Learning, Machine Learning, Transformer
Notifiable Data Breaches Scheme: The Notifiable Data Breaches (NDB) scheme, part of the Privacy Act 1988, requires organisations to notify affected individuals and the Office of the Australian Information Commissioner (OAIC) when a data breach is likely to result in serious harm. The scheme applies to any breach involving personal information, including breaches originating from AI systems.
Australian Context
AI systems that process personal data must have breach detection, containment, and notification processes in place. Once an organisation has reasonable grounds to suspect an eligible data breach, it must take reasonable steps to complete its assessment within 30 days. Failure to comply can result in significant penalties.
Related: Privacy Act 1988, Australian Privacy Principles, Data Governance

Privacy Act 1988: Australia's primary federal privacy legislation, governing how organisations and government agencies handle personal information. The Act includes the thirteen Australian Privacy Principles (APPs) and applies to any AI system that collects, stores, uses, or discloses personal data.
Australian Context
The Privacy Act is under active review with proposed reforms that will strengthen obligations around automated decision-making, algorithmic transparency, and individual rights. Organisations building AI systems should design for the direction of reform, not just current requirements.
Related: Australian Privacy Principles, Notifiable Data Breaches Scheme, AI Ethics Framework (Australia)
Prompt Engineering: The practice of designing and refining the instructions (prompts) given to large language models to elicit accurate, useful, and consistent outputs. Effective prompt engineering involves clear task specification, providing context and examples, setting constraints, and iterative testing.
Related: Large Language Model, RAG (Retrieval-Augmented Generation), Hallucination
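The elements of an effective prompt listed under Prompt Engineering (task, context, examples, constraints) can be assembled with a simple template. The `build_prompt` helper and the sample wording below are hypothetical, and the layout is one reasonable convention rather than a required format for any particular model.

```python
def build_prompt(
    task: str,
    context: str,
    examples: list[tuple[str, str]],
    constraints: list[str],
) -> str:
    """Assemble a prompt from a task, supporting context, examples, and constraints."""
    lines = [f"Task: {task}", "", "Context:", context, "", "Examples:"]
    for question, answer in examples:
        lines += [f"Q: {question}", f"A: {answer}"]
    lines += ["", "Constraints:"]
    lines += [f"- {rule}" for rule in constraints]
    return "\n".join(lines)

prompt = build_prompt(
    task="Summarise the policy excerpt for a customer service officer.",
    context="Refunds are available within 30 days of purchase with proof of payment.",
    examples=[("Can I get a refund after 45 days?", "No, refunds close after 30 days.")],
    constraints=[
        "Answer in two sentences or fewer.",
        "Say 'not stated' if the context does not cover it.",
    ],
)
```

Keeping the template in code makes iterative testing easier, because each element can be changed and compared independently.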
Proof of Concept: A small-scale, time-boxed project designed to demonstrate the feasibility and potential value of an AI use case before committing to full implementation. A well-structured PoC defines clear success criteria, uses representative data, and produces measurable results that inform the business case for scaling.
Australian Context
Australian enterprises typically run AI PoCs over 4 to 8 weeks. The most successful PoCs focus on a single, well-defined business problem with available data and an engaged business sponsor.
Related: Minimum Viable Product (AI), AI Strategy, ROI (AI Context)

RAG (Retrieval-Augmented Generation): An AI architecture that enhances large language model responses by first retrieving relevant information from an external knowledge base, then using that information to generate grounded, accurate answers. RAG reduces hallucinations and enables AI systems to work with current, organisation-specific data without retraining the underlying model.
Australian Context
RAG is widely adopted by Australian enterprises for internal knowledge management, compliance question answering, customer service, and policy lookup. Databricks supports RAG through Vector Search, Model Serving, and integration with frameworks such as LangChain.
Related: Large Language Model, Vector Database, Embeddings, Hallucination
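The retrieve-then-generate flow of RAG can be sketched without any external services. In this simplified illustration, retrieval uses plain word overlap rather than embeddings and vector search, the knowledge base is three invented sentences, and the final call to a language model is left out; only the grounded prompt is built.

```python
def words(text: str) -> set[str]:
    """Lowercase the text, drop punctuation, and return the set of words."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return set(cleaned.split())

# Hypothetical knowledge base; a real system would index organisational documents.
KNOWLEDGE_BASE = [
    "Annual leave requests must be approved by a line manager.",
    "Expense claims over 500 dollars require a receipt.",
    "Laptops must be encrypted before leaving the office.",
]

def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question."""
    query = words(question)
    ranked = sorted(documents, key=lambda doc: len(query & words(doc)), reverse=True)
    return ranked[:top_k]

def build_grounded_prompt(question: str, documents: list[str]) -> str:
    """Place the retrieved passages ahead of the question so the answer stays grounded."""
    context = "\n".join(f"- {passage}" for passage in retrieve(question, documents))
    return (
        "Answer using only the context below. If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

question = "Do expense claims need a receipt?"
top_passage = retrieve(question, KNOWLEDGE_BASE)[0]
grounded_prompt = build_grounded_prompt(question, KNOWLEDGE_BASE)
```

Because the model is instructed to answer only from the retrieved passage, the knowledge base can be updated without retraining the model.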
Reinforcement Learning: A type of machine learning where an agent learns to make decisions by taking actions in an environment and receiving rewards or penalties based on outcomes. The agent improves its strategy over time by maximising cumulative reward. Reinforcement learning powers applications such as robotics, game playing, and recommendation optimisation.
Related: Machine Learning, AI Agents, Deep Learning
Responsible AI: An approach to developing and deploying AI systems that prioritises fairness, transparency, accountability, privacy, and safety. Responsible AI practices include bias testing, explainability, human oversight, impact assessments, and ongoing monitoring of AI system behaviour.
Australian Context
Australia's AI Ethics Framework provides the national reference point for responsible AI. Increasingly, Australian regulators, procurement bodies, and customers expect organisations to demonstrate responsible AI practices, making it a business imperative as well as an ethical one.
Related: AI Ethics Framework (Australia), Hallucination, Data Governance
ROI (AI Context): Return on Investment for AI initiatives, measured across direct cost savings, revenue uplift, risk reduction, efficiency gains, and strategic value creation. AI ROI often follows a J-curve pattern: upfront investment in data foundations and infrastructure precedes accelerating returns as AI capabilities scale across the organisation.
Australian Context
Australian enterprises typically see positive AI ROI within 12 to 18 months of a well-structured program. Measuring ROI requires establishing clear baselines before implementation and tracking both quantitative metrics and qualitative benefits such as faster decision-making and improved customer experience.
Related: AI Strategy, Total Cost of Ownership, Proof of Concept
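The J-curve pattern described under ROI (AI Context) is easiest to see with a worked calculation. The quarterly figures below are invented purely for illustration (in thousands of Australian dollars) and are not benchmarks; the point is the shape of the cumulative curve, which dips before it recovers.

```python
# Hypothetical quarterly figures in AUD thousands, for illustration only.
quarterly_costs = [400, 300, 150, 100, 100, 100]
quarterly_benefits = [0, 50, 150, 300, 450, 600]

def cumulative_net(costs: list[int], benefits: list[int]) -> list[int]:
    """Running total of (benefit - cost) after each quarter."""
    running, curve = 0, []
    for cost, benefit in zip(costs, benefits):
        running += benefit - cost
        curve.append(running)
    return curve

curve = cumulative_net(quarterly_costs, quarterly_benefits)

# First quarter (1-based) in which cumulative benefits cover cumulative costs.
breakeven_quarter = next(i + 1 for i, value in enumerate(curve) if value >= 0)

# Simple ROI over the whole period: net gain divided by total cost.
roi = (sum(quarterly_benefits) - sum(quarterly_costs)) / sum(quarterly_costs)
```

In this invented scenario the cumulative position bottoms out in the second and third quarters and turns positive in the sixth, giving an ROI of roughly 35 per cent over the period.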

Tokens: The basic units of text that large language models process. A token can be a word, part of a word, or a punctuation mark. Tokenisation varies by model, but as a rough guide, one token is approximately three-quarters of a word in English. Token counts determine model context windows (how much text can be processed at once) and directly affect compute costs.
Related: Large Language Model, Prompt Engineering, Transformer
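The three-quarters rule of thumb in the Tokens entry supports quick back-of-envelope estimates. The sketch below is an approximation only: actual counts depend on the model's tokeniser, and the context window size and price used here are hypothetical parameters rather than figures for any real model.

```python
WORDS_PER_TOKEN = 0.75  # rough English-language guide; varies by model

def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a given number of English words."""
    return round(word_count / WORDS_PER_TOKEN)

def fits_context(word_count: int, context_window_tokens: int) -> bool:
    """Check whether a document of this length fits within a context window."""
    return estimate_tokens(word_count) <= context_window_tokens

def estimate_cost(word_count: int, price_per_1k_tokens: float) -> float:
    """Approximate processing cost; the price is a hypothetical input."""
    return estimate_tokens(word_count) / 1000 * price_per_1k_tokens

report_tokens = estimate_tokens(1500)  # a 1,500-word report
report_fits = fits_context(1500, context_window_tokens=8000)
report_cost = estimate_cost(1500, price_per_1k_tokens=0.01)
```

Under this rule a 1,500-word report is about 2,000 tokens, which is why long documents can consume a context window and budget faster than their word count suggests.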
Total Cost of Ownership: The complete cost of implementing and operating an AI or data platform over its lifecycle, including licensing, infrastructure, implementation services, training, ongoing operations, and opportunity costs. TCO analysis helps organisations make informed platform decisions and avoid hidden costs that emerge after deployment.
Related: ROI (AI Context), AI Strategy, Databricks
Transfer Learning: A machine learning technique where a model trained on one task is reused as the starting point for a different but related task. Transfer learning dramatically reduces the data and compute required to build effective models, because the pre-trained model has already learned general patterns that transfer to new domains.
Related: Fine-tuning, Deep Learning, Large Language Model
Transformer: A neural network architecture that uses a mechanism called self-attention to process input data in parallel rather than sequentially. Transformers are the foundation of modern large language models and have revolutionised natural language processing, computer vision, and generative AI since their introduction in 2017.
Related: Large Language Model, Deep Learning, Neural Network
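The self-attention mechanism named in the Transformer entry can be sketched in plain Python. This is a deliberately simplified version of scaled dot-product attention: it uses the input vectors directly as queries, keys, and values, whereas real transformers apply learned projections, multiple attention heads, and many stacked layers.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors: list[list[float]]) -> list[list[float]]:
    """Each output is a weighted blend of all inputs, weighted by similarity."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score this position against every position, scaled by sqrt(dimension).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs

attended = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each output row depends only on the full set of inputs, not on the previous output, which is why the positions can be computed in parallel rather than one after another.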

Unity Catalog: Databricks' unified governance solution for data and AI assets. Unity Catalog provides centralised access control, audit logging, data lineage, data discovery, and fine-grained security (row-level, column-level, and dynamic data masking) across all Databricks workspaces and cloud environments.
Australian Context
Unity Catalog is critical for Australian enterprises that need to demonstrate data governance to regulators. It provides the audit trails required for APRA compliance, the access controls mandated by the Privacy Act, and the lineage tracking expected by data governance frameworks.
Related: Databricks, Data Governance, Data Lakehouse

Vector Database: A specialised database designed to store, index, and query high-dimensional vector embeddings efficiently. Vector databases enable similarity search at scale, powering applications such as semantic search, recommendation engines, and RAG-based AI systems.
Australian Context
Databricks provides built-in Vector Search capabilities, eliminating the need for a separate vector database. Australian enterprises use vector databases to power internal search over policy documents, customer service knowledge bases, and product catalogues.
Related: Embeddings, RAG (Retrieval-Augmented Generation), Large Language Model
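The store, index, and query behaviour described under Vector Database can be illustrated with a tiny in-memory index. The `TinyVectorIndex` class, document names, and two-dimensional vectors are hypothetical, and the search is brute force; production vector databases use approximate nearest-neighbour indexes to stay fast at scale.

```python
import heapq
import math

def normalise(vector: list[float]) -> list[float]:
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

class TinyVectorIndex:
    """Brute-force similarity search over a small set of stored vectors."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._items.append((doc_id, normalise(vector)))

    def query(self, vector: list[float], top_k: int = 2) -> list[str]:
        """Return the ids of the top_k stored vectors most similar to the query."""
        unit = normalise(vector)
        scored = (
            (sum(a * b for a, b in zip(unit, stored)), doc_id)
            for doc_id, stored in self._items
        )
        return [doc_id for _, doc_id in heapq.nlargest(top_k, scored)]

index = TinyVectorIndex()
index.add("leave-policy", [0.9, 0.1])
index.add("travel-policy", [0.7, 0.3])
index.add("product-catalogue", [0.1, 0.9])

nearest = index.query([1.0, 0.0], top_k=2)
```

The query returns the two policy documents ahead of the product catalogue because their vectors point in nearly the same direction as the query vector.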