A bewildering array of new terms has accompanied the rise of artificial intelligence, a technology that aims to mimic human thinking.
This glossary provides a guide to some of the most important concepts and terms behind AI to help demystify one of the most impactful technology revolutions in our lifetime.
In addition, selected business terms used in the downloadable whitepapers are included.
AI models, or Artificial Intelligence models, are computational structures designed to learn from data. They are used to make predictions or decisions without being explicitly programmed to perform the task. These models are built using algorithms and trained using data.
AI models can be categorized into several types, including supervised learning models, unsupervised learning models, semi-supervised learning models, and reinforcement learning models.
An algorithm is a step-by-step procedure or a set of rules to be followed in calculations or other problem-solving operations, especially by a computer. It's a detailed series of instructions for carrying out an operation or solving a problem.
In a computing context, algorithms are essential because they define the specific steps that a computer program needs to take to carry out a specific task. Whether it's sorting data, making calculations, or anything else, algorithms are what programs use to get things done.
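As a concrete illustration, the classic binary search algorithm below encodes its problem-solving steps explicitly (a minimal sketch in Python):

```python
def binary_search(items, target):
    """Return the index of target in a sorted list, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2       # inspect the middle element
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            low = mid + 1             # discard the lower half
        else:
            high = mid - 1            # discard the upper half
    return -1

print(binary_search([2, 5, 8, 12, 23], 12))  # 3
```

Each comparison halves the remaining search space, which is why the procedure is efficient on sorted data.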
An application programming interface (API) is a way to programmatically access other pieces of software or data sets. This includes a set of rules and protocols for building and interacting with software applications. It defines the methods and data formats that a program can use to communicate with other programs or components.
APIs allow different software systems to interact with each other, enabling them to share data and functionalities. They act as a bridge between different software applications, allowing them to work together.
Several example types follow:
APIs are essential in modern software development, enabling the creation of more complex and feature-rich applications by allowing different software components to work together seamlessly.
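A minimal sketch of what an API contract looks like in practice; the endpoint name, request fields, and response format below are all hypothetical:

```python
import json

def get_weather(request_json):
    """Handle a hypothetical GET /weather request expressed as JSON."""
    request = json.loads(request_json)
    city = request["city"]
    # A real service would look this up; here we return a canned response.
    response = {"city": city, "temperature_c": 21, "conditions": "sunny"}
    return json.dumps(response)

reply = get_weather('{"city": "London"}')
print(json.loads(reply)["temperature_c"])  # 21
```

The point is the agreed contract: any program that sends a request in the defined format can consume the response, regardless of how either side is implemented.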
Artificial general intelligence (AGI) refers to a type of Artificial Intelligence that can understand, learn, adapt, and apply knowledge across a broad range of tasks at a level equal to or beyond a human being.
Unlike Narrow AI, which is designed to perform a specific task, such as voice recognition, AGI can theoretically perform any intellectual task that a human being can do. It can understand, interpret, and respond to its environment in a way that's indistinguishable from a human.
A simple comparison follows:
It's important to note that as of now, AGI remains largely theoretical. While we have many examples of Narrow AI, we have yet to create a system that exhibits AGI.
Artificial intelligence (AI) is the ability of software to perform tasks that traditionally require human intelligence. It refers to the simulation of human intelligence processes by machines, especially computer systems.
These processes include learning (the acquisition of information and rules for using the information), reasoning (using the rules to reach approximate or definite conclusions), and self-correction.
AI can be categorized into two types:
AI technologies include machine learning (where a computer system is fed large amounts of data, which it then uses to learn how to carry out a specific task), neural networks, and natural language processing.
AI has a wide range of applications, from voice assistants like Siri and Alexa, to recommendation systems used by Netflix and Amazon, to autonomous vehicles and more.
The term "Big Four" refers to the four largest international professional services networks, offering cyber security, audit, tax, consulting, advisory, corporate finance, and legal services.
The Big Four firms are:
A chatbot is a software application designed to simulate human conversation. It interacts with users through messaging platforms, websites, mobile apps, or through voice command interfaces.
Chatbots can be rule-based or powered by artificial intelligence. Rule-based chatbots can only respond to specific commands, while AI-powered chatbots use machine learning and natural language processing to understand and respond to a wider range of inputs in a more conversational manner. The two types of chatbots are described below.
Chatbots are used in a variety of applications, including customer service, information retrieval, and even in therapeutic contexts. They can provide 24/7 support, answer frequently asked questions, and help guide users through websites or applications.
A cognitive bias is a systematic error in thinking that affects the decisions and judgments that people make. It's a kind of mental shortcut, often based on individual perceptions and past experiences, which can lead to distortions in how we perceive reality.
Cognitive biases can lead to perceptual blindness, inaccurate judgments, illogical interpretations, or what is broadly called irrationality. They are often a result of our brain's attempt to simplify information processing.
Some common examples of cognitive biases include:
Data repositories are centralized places where data is stored and maintained. A repository can be a place where multiple databases or files are located for distribution over a network, or a simple place where data is deposited for safekeeping.
Data repositories are often used to store various types of data such as raw data, curated data, metadata, and relational databases. They can be physical or virtual, and can be used for data backup, archiving, and data sharing purposes.
Deep Learning is a subset of machine learning, which is essentially a neural network with three or more layers. These neural networks attempt to simulate the behaviour of the human brain—albeit far from matching its ability—to "learn" from large amounts of data.
While a neural network with a single layer can still make approximate predictions, additional hidden layers can help optimize and refine for accuracy. Three layers are described below.
Deep learning drives many artificial intelligence (AI) applications and services that improve automation, performing tasks such as image and speech recognition, and natural language processing. They're able to recognize patterns with extreme accuracy, given enough data.
Encryption of data in motion, also known as data in transit, refers to the process of protecting data while it is being transferred from one location to another. This could be across the internet or through a private network.
The goal of encrypting data in motion is to ensure that the data, if intercepted during transmission, cannot be read or understood by anyone who is not the intended recipient. This is typically achieved using various encryption protocols such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), or Internet Protocol Security (IPsec). Further details follow.
Encrypting data in motion is a critical component of data security.
Encryption of data at rest refers to the process of protecting inactive data stored physically in any digital form, including databases, data warehouses, spreadsheets, archives, tapes, off-site backups, mobile devices, or in the cloud.
The goal of encrypting data at rest is to ensure that sensitive data is not accessible without proper authorization, even if the storage medium or device is stolen or compromised. This is typically achieved using various encryption methods such as Advanced Encryption Standard (AES), RSA, or Twofish. Further details follow.
Encrypting data at rest is a critical component of data security.
The endowment effect, in behavioural economics, refers to an emotional bias where people value a good or service more once their property right to it has been established.
In other words, people tend to place a higher value on objects they own than objects that they do not. This is sometimes also referred to as "divestiture aversion".
The endowment effect was first theorized by Richard Thaler, who suggested that people value things more highly as soon as they own them. This can lead to decision-making that is not in the individual's best economic interest.
For example, if you buy a concert ticket for $50, but on the day of the concert, you wouldn't be willing to pay more than $30 for the same ticket, you're still likely to go to the concert because you own the ticket and therefore value it more.
This effect has significant implications in areas such as market prices and the concept of a "fair" price.
Foundation models (FMs) are deep learning models trained on vast quantities of unlabelled data, both structured and unstructured. Foundation models can be used for a wide range of tasks as is or adapted to specific tasks with fine-tuning.
Examples of these models include GPT-4, DALL-E 2, and Stable Diffusion. Foundation models are a class of AI models that are pre-trained on a broad range of internet text and can be fine-tuned for specific tasks.
They are called "foundation" models because they serve as a base upon which a wide range of downstream models and applications can be built.
These models, such as GPT-4 by OpenAI, have demonstrated impressive performance on a variety of tasks, including translation, question-answering, and text generation, often achieving state-of-the-art results. Further details follow.
While foundation models have shown great promise, they also raise important questions and challenges related to their deployment, including issues of fairness, interpretability, robustness, and their economic and societal impacts.
Fine-tuning in AI refers to the process of taking a pre-trained model (a model that has been trained on a large-scale dataset) and adapting it to a specific task. This is done by continuing the training process on a smaller, task-specific dataset, and adjusting the model's parameters to optimize its performance on the new task.
The idea behind fine-tuning is that the pre-trained model has already learned a lot of useful, general-purpose features from the large-scale dataset, and only minor modifications are needed to adapt it to the specific task.
Fine-tuning is a common practice in many areas of AI, including computer vision and natural language processing. For example, in natural language processing, models like GPT-4 are often fine-tuned on a specific task, such as sentiment analysis or question answering, to achieve state-of-the-art performance.
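The toy NumPy sketch below illustrates the idea: a frozen random projection stands in for the pre-trained model's learned features, and only a small task-specific head is updated during fine-tuning.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" feature extractor: frozen weights (random numbers stand in
# here for features learned on a large-scale dataset).
W_frozen = rng.normal(size=(4, 8))

def features(x):
    return np.tanh(x @ W_frozen)      # frozen: never updated below

# Task-specific head, the only part we fine-tune.
w_head = np.zeros(8)

def predict(x):
    return 1 / (1 + np.exp(-(features(x) @ w_head)))  # sigmoid output

# A small, task-specific dataset.
X = rng.normal(size=(32, 4))
y = (X[:, 0] > 0).astype(float)

# A few gradient steps on the head only: this is the fine-tuning loop.
for _ in range(200):
    p = predict(X)
    grad = features(X).T @ (p - y) / len(y)
    w_head -= 0.5 * grad

accuracy = ((predict(X) > 0.5) == y).mean()
```

Because the frozen layers are untouched, fine-tuning needs far less data and compute than training the whole model from scratch.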
A force multiplier in business refers to a factor or a combination of factors that gives personnel, a team, or a system the ability to accomplish greater feats than without it.
It's a tool or capability that significantly increases the potential output or effectiveness, such as AI which automates certain repetitive tasks.
In the context of business, a force multiplier could be a piece of technology, a new process, or a strategic partnership that dramatically increases productivity or growth.
Generative AI is AI that is typically built using foundation models and has capabilities that earlier AI did not, such as the ability to generate content.
Foundation models can also be used for nongenerative purposes, such as classifying user sentiment as negative or positive based on call transcripts. Generative AI offers a significant improvement over earlier model use cases.
Generative AI is a subset of artificial intelligence that focuses on creating new content. It's a type of machine learning that allows computers to generate data that resembles the data it was trained on.
This can include a wide range of outputs, such as text, images, music, and even voice. For example, a generative AI model could be trained on a dataset of paintings and then generate a new painting that resembles those it was trained on. The process is as follows:
One of the most well-known types of generative AI models is Generative Adversarial Networks (GANs). These comprise two parts, as follows:
The two parts work together to improve the quality of the generated content. Other examples of generative models include Variational Autoencoders (VAEs) and Transformer models like GPT-3 for text generation.
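The two-part structure of a GAN can be sketched as follows; this is an untrained toy illustration in which single linear maps stand in for the deep networks used in practice:

```python
import numpy as np

rng = np.random.default_rng(0)

class Generator:
    """Maps random noise to a synthetic ("fake") sample."""
    def __init__(self):
        self.w = rng.normal(size=(2, 2))
    def generate(self, noise):
        return noise @ self.w

class Discriminator:
    """Scores how "real" a sample looks, on a (0, 1) scale."""
    def __init__(self):
        self.w = rng.normal(size=2)
    def score(self, sample):
        return 1 / (1 + np.exp(-(sample @ self.w)))  # sigmoid score

gen, disc = Generator(), Discriminator()
fake = gen.generate(rng.normal(size=2))
score = disc.score(fake)   # the discriminator judges the generator's output
```

In real training the two parts alternate: the discriminator learns to tell real from fake, while the generator learns to fool it, and each side's progress pushes the other to improve.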
GPT in AI stands for Generative Pretrained Transformer.
It's a type of artificial intelligence model developed by OpenAI for natural language processing tasks, such as translation, question answering, and text generation.
The "generative" part refers to the model's ability to generate creative outputs, such as writing a story or an essay.
"Pretrained" means that the model has been previously trained on a large amount of text data, allowing it to generate coherent and contextually relevant sentences.
The "transformer" part refers to the model's architecture, which uses a mechanism called attention to weigh the influence of different words when generating an output.
A graphics processing unit (GPU) is a type of processor designed to handle tasks related to rendering graphics, particularly for gaming, 3D modelling, and video editing.
However, in recent years, GPUs have also become popular in the field of computing for their ability to perform parallel operations on large blocks of data, making them ideal for tasks such as machine learning, deep learning, and other data-intensive tasks. Further details follow.
In the context of AI and machine learning, GPUs can significantly speed up the training process for neural networks, as these tasks involve a lot of matrix and vector operations, which can be parallelized effectively on a GPU.
Groupthink is a psychological phenomenon that occurs within a group of people, in which the desire for harmony or conformity in the group results in an irrational or dysfunctional decision-making outcome.
Group members try to minimize conflict and reach a consensus decision without critical evaluation of alternative viewpoints, by actively suppressing dissenting viewpoints, and by isolating themselves from outside influences.
It was first researched by Irving Janis in the 1970s, who explained that groupthink occurs when a group makes faulty decisions because group pressures lead to a deterioration of “mental efficiency, reality testing, and moral judgment”.
Groupthink can lead to poor decision making since it discourages creativity and individual responsibility. It is a common issue in team settings, especially in situations where team cohesiveness is high and there is a lack of open communication.
Guardrails in Generative AI are a set of constraints or rules that are put in place to guide the AI system in generating outputs. They are used to ensure that the AI system operates within certain boundaries and does not produce undesirable or inappropriate results.
For example, in a text generation AI, guardrails could be set to prevent the system from generating text that includes profanity, hate speech, or sensitive information. They can also be used to guide the AI towards generating more creative or diverse outputs.
Guardrails are an important part of responsible AI usage, as they help to ensure that AI systems are used in a way that is ethical, fair, and respects user privacy.
They are a key tool in managing the risks associated with AI and are increasingly being used in a wide range of AI applications, from chatbots to content generation.
In the context of AI, particularly in generative models, "hallucinations" refer to instances where the AI generates or outputs information that wasn't in the input data. This can happen when the AI model makes assumptions or fills in gaps based on its training data.
For example, in image generation, an AI might "hallucinate" details that weren't in the original image, such as adding extra objects or altering colours. In text generation, an AI might generate details or facts that weren't mentioned or implied in the input text.
These hallucinations can sometimes lead to creative and unexpected results, but they can also lead to inaccuracies and mistakes. It's one of the challenges in developing and working with generative AI models.
It's important to note that these hallucinations are not conscious or intentional on the part of the AI, but rather a result of the statistical patterns it has learned during its training process.
Ideation is the creative process of generating, developing, and communicating new ideas.
It involves several stages, from the initial generation of ideas to their development and realization.
In a business context, ideation can be used to generate new product ideas, improve existing services, or devise solutions to problems. It often involves brainstorming sessions, collaboration, and iterative feedback.
In the context of information technology (IT), integrations refer to the process of combining different computing systems and software applications physically or functionally, to act as a coordinated whole.
The goal of integration is to create a seamless flow of data between various IT systems and software applications.
Integrations can be done in several ways, including through APIs (Application Programming Interfaces), middleware, or even manually (though this is less common due to the high potential for error and inefficiency). Further details follow.
ISO 27001 is an international standard on how to manage information security.
The standard was originally published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) in 2005 and then revised in 2013.
ISO 27001 provides a framework for establishing, implementing, maintaining, and continually improving an Information Security Management System (ISMS).
An ISMS is a systematic approach to managing sensitive company information so that it remains secure. It includes people, processes, and IT systems by applying a risk management process and gives assurance to company stakeholders that risk is being managed effectively.
Key sections of ISO 27001 include:
Companies that meet the requirements of the standard can be certified by an accredited certification body following successful completion of an audit.
Knowledge worker productivity refers to the efficiency and effectiveness with which knowledge workers, such as analysts, managers, software developers, or consultants, can produce valuable outputs from their work.
Knowledge workers primarily deal with information and knowledge, rather than physical tasks or manual labour. Their productivity is often harder to measure than that of manual workers, as the outputs of their work are often intangible, and their tasks are less repetitive and more complex. Examples follow.
Improving knowledge worker productivity can involve a variety of strategies, such as providing better tools and technologies, improving work processes, providing training and development opportunities, and creating a supportive work environment that encourages creativity and problem-solving.
Large language models (LLMs) make up a class of foundation models that can process massive amounts of unstructured text and learn the relationships between words or portions of words, known as tokens.
This enables LLMs to generate natural-language text, performing tasks such as summarisation or knowledge extraction. GPT-4 is an example of an LLM.
Large Language Models are a type of artificial intelligence model that have been trained on a vast amount of text data. They are designed to generate human-like text based on the input they are given.
These models, such as GPT-4 by OpenAI, have hundreds of billions of parameters and are trained on diverse Internet text.
However, they do not know specifics about which documents were in their training set or have access to any personal or confidential information unless it has been shared with them in the course of the conversation.
Large Language Models can generate creative writing, answer questions, translate languages, and even write software code, among other tasks. They analyze the input they're given and generate output based on patterns and structures they've learned during training.
Despite their capabilities, these models have limitations. They do not understand text in the way humans do and can sometimes write incorrect or nonsensical responses.
They are sensitive to the input they're given and can sometimes generate inappropriate or biased content.
They also lack the ability to provide personal experiences or opinions, as they do not have access to personal data unless explicitly provided during the conversation.
Machine Learning (ML) is a subset of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves.
The process of learning begins with observations or data, such as examples, direct experience, or instruction, to look for patterns in data and make better decisions in the future based on the examples that we provide.
The primary aim is to allow the computers to learn automatically without human intervention or assistance and adjust actions accordingly. Three examples follow.
Machine learning is used in a range of computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible; for example, in applications including email filtering, detection of network intruders, and computer vision.
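As a minimal example of learning from examples rather than explicit rules, the nearest-neighbour classifier below labels a new point using training data alone (the data here are hypothetical):

```python
def nearest_neighbour(train, point):
    """Classify point by the label of the closest training example."""
    best = min(train, key=lambda ex: sum((a - b) ** 2
                                         for a, b in zip(ex[0], point)))
    return best[1]

# Labelled examples the system "learns" from: (features, label).
examples = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"),
            ((5.0, 5.0), "dog"), ((4.8, 5.2), "dog")]

print(nearest_neighbour(examples, (1.1, 0.9)))  # cat
print(nearest_neighbour(examples, (5.1, 4.9)))  # dog
```

No rule for "cat" or "dog" is ever written down; the behaviour comes entirely from the data provided.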
Measuring the productivity of knowledge workers can be challenging due to the intangible and often complex nature of their work.
However, there are several methods that can be used to assess their productivity, as follows:
It is important to recognise that the best approach often involves a combination of these methods, and the specific metrics used can vary depending on the role and the organization.
It's also important to ensure that the methods used to measure productivity align with the organization's overall goals and values.
In the context of information technology (IT) and Artificial Intelligence (AI), modality refers to the way information is represented or the type of data that is being processed.
Different modalities include text, images, audio, video, and more. Each of these modalities requires different techniques for processing and analysis.
For example, text data might be processed using natural language processing (NLP) techniques, while image data might be processed using computer vision techniques.
In AI, multi-modal systems are those that can process and integrate information from multiple different modalities.
For example, a multi-modal AI system might be able to understand a video by processing both the visual data (the images) and the audio data (the soundtrack).
Natural language processing (NLP) is a subfield of artificial intelligence that focuses on the interaction between computers and humans through natural language.
The ultimate objective of NLP is to read, decipher, understand, and make sense of the human language in a valuable way.
NLP involves several tasks and techniques, including but not limited to:
NLP is used in a variety of applications, including language translation, sentiment analysis, chatbots, voice assistants (like Siri or Alexa), and more. It's a key technology for enabling human-computer interaction in a natural, intuitive way.
A neural network is a computing model inspired by the way biological brains work. It's designed to simulate the behaviour of interconnected brain cells (neurons) in order to solve complex tasks.
Neural networks are a key part of artificial intelligence (AI) and are used for tasks that require pattern recognition, decision-making, and learning from experience. They're particularly effective for processing unstructured data, such as images and natural language text.
Here's a simple breakdown of a neural network:
Each node in the hidden layers represents a neuron and contains an activation function that determines whether and to what extent that signal should progress further through the network.
The process of adjusting the connection weights so that the network's outputs better match the expected results is known as training the neural network.
Neural networks are used in a variety of applications, including image and speech recognition, natural language processing, and recommendation systems.
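A bare-bones forward pass through such a network can be sketched in NumPy; the weights here are random stand-ins for values that training would normally learn:

```python
import numpy as np

def forward(x, layers):
    """Pass input x through each (weights, bias) layer with tanh activation."""
    for W, b in layers:
        x = np.tanh(x @ W + b)   # activation function at every neuron
    return x

rng = np.random.default_rng(0)
# Input of 3 features -> hidden layer of 4 neurons -> 1 output neuron.
layers = [(rng.normal(size=(3, 4)), np.zeros(4)),
          (rng.normal(size=(4, 1)), np.zeros(1))]
output = forward(np.array([0.5, -0.2, 0.1]), layers)
```

Training would repeatedly compare `output` against the expected answer and nudge each layer's weights to reduce the error.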
Penetration testing, also known as pen testing or ethical hacking, is a practice in software security where a system, network, or web application is intentionally attacked to identify potential security vulnerabilities that could be exploited by hackers.
The goal of penetration testing is to uncover weak spots in an organization's security posture before attackers do. It can involve the use of automated tools or manual techniques.
The main objectives of penetration testing include:
Productivity refers to the efficiency of a person, machine, factory, system, etc., in converting inputs into useful outputs. It is generally measured by the rate at which products are generated by a system running at full capacity.
In the context of business, productivity is a measure of how effectively resources (like labour, capital, and materials) are used to produce goods and services. A higher productivity means that more goods and services are produced with the same amount of resources.
Further details follow.
Improving productivity is a major goal in many organizations, as it can lead to increased profitability. This can be achieved through various means, such as implementing new technologies, improving processes, or enhancing worker skills.
Prompt engineering refers to the process of designing, refining, and optimising input prompts to guide a generative AI model toward producing desired and accurate outputs.
Prompt engineering is a technique used in the field of artificial intelligence, particularly with language models, to craft effective prompts that guide the model to produce the desired output.
A "prompt" is the input given to a language model to which it responds. For example, if you're using a language model to write an email, the prompt might be the beginning of the email, and the model's task is to complete it.
Prompt engineering involves designing and optimizing these prompts to get the most useful and accurate responses from the model. This can involve specifying the format of the desired answer, providing examples of correct answers, or asking the model to think step-by-step or debate pros and cons before settling on an answer.
Prompt engineering is an important skill when working with large language models, as the design of the prompt can significantly influence the quality of the model's output.
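As an illustration, compare a naive prompt with an engineered one for the same task; both strings are hypothetical and could be sent to any large language model API:

```python
# Naive prompt: gives the model no guidance on format, role, or focus.
naive_prompt = "Summarise this contract."

# Engineered prompt: sets a role, constrains the output format, asks for
# step-by-step reasoning, and flags what matters most.
engineered_prompt = (
    "You are a legal analyst. Summarise the contract below in exactly "
    "three bullet points, each under 20 words. Think step by step, and "
    "flag any clause that mentions termination.\n\nContract: {text}"
)

prompt = engineered_prompt.format(text="(contract text goes here)")
```

The model call itself is omitted; the point is how much of the output's quality is determined before the model is ever invoked.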
Return on investment (ROI) is a financial metric that is widely used to measure the profitability of an investment. It is a ratio that compares the gain or loss from an investment relative to its cost.
The formula to calculate ROI follows:
ROI = (Net Profit / Cost of Investment) * 100%
The ROI is expressed as a percentage. If it's positive, it means the investment gained value; if it's negative, the investment lost value.
For example, if you invest $1,000 in a project and earn $1,200 in return, your ROI would be 20%.
A higher ROI indicates a greater return on investment.
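The calculation can be expressed directly; the figures mirror the hypothetical example above:

```python
def roi(net_profit, cost_of_investment):
    """Return on investment, expressed as a percentage."""
    return net_profit / cost_of_investment * 100

# $1,000 invested, $1,200 returned -> $200 net profit -> 20% ROI.
print(roi(1200 - 1000, 1000))  # 20.0
```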
Self-attention, also known as intra-attention, is a mechanism used in artificial intelligence models, particularly in natural language processing (NLP), that helps the model to focus on different parts of the input when producing an output.
It's a key component of Transformer models, which are widely used in NLP tasks.
In the context of NLP, self-attention allows a model to weigh the importance of words in an input sequence when generating an output sequence.
For example, when translating a sentence from one language to another, the model uses self-attention to determine which words in the input sentence are most relevant to each word in the output sentence. The process is outlined as follows:
Self-attention allows the model to capture dependencies between words regardless of their distance in the sentence, making it very effective for many NLP tasks. It's one of the key innovations behind models like GPT-4.
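A minimal NumPy sketch of scaled dot-product self-attention; for simplicity it omits the learned query, key, and value projections used in real Transformer models:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of word vectors."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)    # similarity of every pair of words
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax: rows sum to 1
    return weights @ X               # each output is a weighted mix of inputs

# A "sentence" of three 4-dimensional word vectors.
X = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0]])
out = self_attention(X)
```

Each output row blends information from every position in the sequence, weighted by similarity, which is how distant words can influence one another.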
Status quo bias is a cognitive bias that refers to the preference for the current state of affairs.
In other words, people are generally inclined to resist change and prefer to maintain things as they are. Examples follow:
In business and economics, understanding status quo bias can be important as it can influence consumer behaviour, investment decisions, and policy making.
SOC 2 Type 2 is a type of audit report that focuses on a service organization's non-financial reporting controls as they relate to security, availability, processing integrity, confidentiality, and privacy of a system.
The SOC 2 Type 2 report is issued by an independent auditing firm and is part of the American Institute of CPAs (AICPA)'s Service Organization Control reporting platform.
The key difference between a SOC 2 Type 1 and SOC 2 Type 2 report is that a Type 1 report is concerned with the suitability of the design of controls at a specific point in time, whereas a Type 2 report also includes the operating effectiveness of those controls over a specified review period, typically 6 months to a year.
The SOC 2 Type 2 report includes:
SOC 2 Type 2 reports are typically requested by stakeholders (e.g., customers, regulators, business partners, suppliers) of the service organization who need assurance about the controls at the organization that affect the security, availability, and processing integrity of the systems the service organization uses to process users' data, and the confidentiality and privacy of the information processed by these systems.
Structured data are tabular data, such as data organised in tables, databases or spreadsheets. This data can be used to train some machine learning models effectively.
Structured data refers to any data that resides in a fixed field within a record or file. This includes data contained in relational databases and spreadsheets, where methods of structuring the data are applied.
The key aspect of structured data is that it's organized in a manner that's easily understandable by machines. The structure is rigid and the data usually follows a specific schema, meaning it's organized in a predefined manner and with a set pattern. Examples follow:
Structured data is the opposite of unstructured data, which is data that doesn't have a predefined schema or isn't organized in a predefined manner, such as emails, social media posts, and word processing documents.
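The contrast can be shown side by side; the records below are hypothetical:

```python
import csv
import io

# Structured: every record follows the same schema of fixed fields.
raw = "name,age,city\nAda,36,London\nAlan,41,Manchester\n"
rows = list(csv.DictReader(io.StringIO(raw)))
print(rows[0]["city"])  # London -- any field is directly addressable

# Unstructured: free text with no fixed fields; extracting the same facts
# requires more advanced techniques such as natural language processing.
note = "Ada, 36, lives in London; Alan (41) is based in Manchester."
```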
"Under the hood" in the context of prompt engineering in AI refers to the inner workings or mechanisms of the AI model that are not immediately visible or apparent to the user.
It's a metaphor derived from car terminology, where "under the hood" refers to the engine and other mechanical parts located under the car's hood.
In prompt engineering, "under the hood" could refer to:
Understanding what's happening "under the hood" can help in designing effective prompts, interpreting the model's outputs, and troubleshooting any issues.
Tokens in Generative AI refer to the smallest units of data that the model can understand and generate.
In the context of text generation, a token can be as small as a character or as large as a word or even a sentence, depending on how the model is designed.
For example, in OpenAI's GPT-4, a token is equivalent to a chunk of text, which can be as short as one character or as long as one word (e.g., 'a', 'be', 'car', 'hello'). The model reads in these tokens one at a time and uses them to predict what comes next.
Tokens are crucial in Generative AI because they determine the granularity of the model's understanding and generation capabilities. A model that operates on word-level tokens might generate more coherent text but at the cost of missing out on finer details that a character-level model could capture.
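A naive word-level tokenizer illustrates the idea; production models use subword schemes such as byte-pair encoding rather than simple whitespace splitting:

```python
def tokenize(text):
    """Split text into word-level tokens (a deliberately naive scheme)."""
    return text.lower().split()

tokens = tokenize("Hello world from the model")
print(tokens)       # ['hello', 'world', 'from', 'the', 'model']
print(len(tokens))  # 5
```

Subword tokenizers make a different trade-off: common words stay whole, while rare words are split into smaller pieces the model has seen before.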
Training an AI model refers to the process of learning from data.
It's a phase in the development of an AI system where the model learns to make predictions or decisions based on input data. The process follows.
The goal of training an AI model is to create a system that can make accurate predictions or decisions based on new, unseen data.
Transformers are a relatively new neural network architecture that relies on self-attention mechanisms to transform a sequence of inputs into a sequence of outputs while focusing its attention on important parts of the context around the inputs.
Transformers are a type of model architecture used in the field of artificial intelligence, specifically in natural language processing (NLP). They were introduced in a paper titled "Attention is All You Need" by Vaswani et al., from Google Brain, in 2017.
The key innovation of Transformers is the self-attention mechanism, which allows the model to weigh the importance of words in an input sequence when generating an output sequence.
This is particularly useful in tasks like machine translation, where the meaning of a word can depend on its context in a sentence.
Transformers have been the basis for many state-of-the-art models in NLP, such as BERT (Bidirectional Encoder Representations from Transformers), GPT (Generative Pretrained Transformer), and others.
Use cases are targeted applications of a technology to a specific business challenge that produce one or more measurable outcomes.
For example, in marketing, generative AI could be used to generate creative content such as personalised emails.
Unstructured data lacks a consistent format or structure.
Unstructured data includes, for example, text, images and audio files.
Typically, unstructured data requires more advanced techniques to extract insights.
Unstructured data refers to information that either does not have a pre-defined data model or is not organized in a pre-defined manner. This type of data is typically text-heavy, but may contain data such as dates, numbers, and facts as well.
Examples of unstructured data include:
Unstructured data can be stored in a variety of ways, including in data lakes or NoSQL databases.
Despite its lack of structure, this data can be extremely valuable if analyzed properly, with the help of advanced technologies like natural language processing, text analytics, data mining, and machine learning.
A walled-garden in the context of AI refers to a closed system where the operations and processes are controlled by the system's owner or operator.
This term is often used to describe AI systems that operate in a restricted environment and do not allow for third-party integrations or access to their underlying data or algorithms.
Here's a simple breakdown:
In the context of AI, a walled-garden can be beneficial for maintaining control over the quality and security of an AI system. However, it can also limit the system's ability to learn from diverse data sources and integrate with other systems.