Glossary of AI Technical Terms

A bewildering array of new terms has accompanied the rise of artificial intelligence, a technology that aims to mimic human thinking.

This glossary provides a guide to some of the most important concepts and terms behind AI to help demystify one of the most impactful technology revolutions in our lifetime.

In addition, selected business terms used in the downloadable whitepapers are included.

 

Technical Terms

Artificial Intelligence Models

AI (artificial intelligence) models are computational structures designed to learn from data. They are used to make predictions or decisions without being explicitly programmed to perform the task. These models are built using algorithms and trained on data.

AI models can be categorized into several types, including supervised learning models, unsupervised learning models, semi-supervised learning models, and reinforcement learning models.

  •  Supervised Learning Models: These models are trained using labelled data, i.e., data that includes both the input and the expected output.
  •  Unsupervised Learning Models: These models are used when the training data is not labelled, i.e., the model needs to find patterns in the input data on its own.
  •  Semi-Supervised Learning Models: These models use a combination of a small amount of labelled data and a large amount of unlabelled data for training.
  •  Reinforcement Learning Models: These models learn by interacting with their environment.
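
To make the supervised case concrete, here is a minimal sketch using the scikit-learn library (assumed available); the features, labels, and data values are invented purely for illustration.

```python
# A minimal supervised-learning sketch using scikit-learn (assumed installed).
# The dataset below is illustrative only.
from sklearn.linear_model import LogisticRegression

# Labelled training data: inputs (hours studied, hours slept) and outcomes (0 = fail, 1 = pass).
X_train = [[2, 9], [1, 5], [5, 1], [8, 8], [6, 2], [3, 7]]
y_train = [0, 0, 0, 1, 1, 0]

model = LogisticRegression()
model.fit(X_train, y_train)       # learn a mapping from inputs to labels

print(model.predict([[7, 6]]))    # predict the label for an unseen input
```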

Algorithm

An algorithm is a step-by-step procedure or a set of rules to be followed in calculations or other problem-solving operations, especially by a computer. It's a detailed series of instructions for carrying out an operation or solving a problem.

In a computing context, algorithms are essential because they define the specific steps that a computer program needs to take to carry out a specific task. Whether it's sorting data, making calculations, or anything else, algorithms are what programs use to get things done.
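
As a simple illustration, the following Python function implements one classic algorithm, insertion sort, as an explicit step-by-step procedure a computer can follow.

```python
# A classic example of an algorithm: insertion sort, expressed step by step.
def insertion_sort(items):
    """Sort a list in place by repeatedly inserting each element
    into its correct position among the already-sorted prefix."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items

print(insertion_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]
```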

Application Programming Interface (API)

An application programming interface (API) is a way to programmatically access other pieces of software or data sets. It consists of a set of rules and protocols for building and interacting with software applications, defining the methods and data formats a program can use to communicate with other programs or components.

APIs allow different software systems to interact with each other, enabling them to share data and functionalities. They act as a bridge between different software applications, allowing them to work together.

Several example types follow:

  •  Web APIs: Also known as HTTP APIs or REST APIs, these allow communication between different web services. For example, a web API might allow a third-party application to access specific data from a web service.
  •  Operating System APIs: These define how different software applications interact with the operating system. For example, an operating system API might allow a software application to create a file, start a print job, or open a network connection.
  •  Library or Framework APIs: These provide pre-defined functions and methods that can be used to perform specific tasks, saving developers from having to write this code themselves.

 APIs are essential in modern software development, enabling the creation of more complex and feature-rich applications by allowing different software components to work together seamlessly.
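
As an illustrative sketch, the snippet below calls a hypothetical web API over HTTP using only Python's standard library; the endpoint URL is a placeholder, not a real service.

```python
# A minimal sketch of calling a web (HTTP) API using only the standard library.
import json
import urllib.request

url = "https://api.example.com/v1/status"  # hypothetical endpoint
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read().decode("utf-8"))  # parse the JSON reply

print(data)
```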

Artificial General Intelligence

Artificial general intelligence (AGI) refers to a type of artificial intelligence that can understand, learn, adapt, and implement knowledge in a broad range of tasks at a level equal to or beyond a human being.

Unlike Narrow AI, which is designed to perform a specific task, such as voice recognition, AGI can theoretically perform any intellectual task that a human being can do. It can understand, interpret, and respond to its environment in a way that's indistinguishable from a human.

A simple comparison follows:

  • Narrow AI: Can beat world champions at chess (e.g., IBM's Deep Blue), but can't recognize faces or understand natural language.
  • AGI: Can beat world champions at chess, recognize faces, understand natural language, perform business analysis, write like a journalist, and adapt to new tasks it's never seen before.

It's important to note that as of now, AGI remains largely theoretical. While we have many examples of Narrow AI, we have yet to create a system that exhibits AGI.

Artificial Intelligence (AI)

Artificial intelligence (AI) is the ability of software to perform tasks that traditionally require human intelligence. It refers to the simulation of human intelligence processes by machines, especially computer systems.

These processes include learning (the acquisition of information and rules for using the information), reasoning (using the rules to reach approximate or definite conclusions), and self-correction.

AI can be categorized into two types:

  1. Narrow AI: These are systems designed to perform a narrow task (e.g., facial recognition, voice command, driving a car). Most of the AI that we encounter on a day-to-day basis is Narrow AI.
  2. Artificial General Intelligence (AGI): This is a type of AI that has all the abilities of human intelligence. AGI can understand, learn, adapt, and implement knowledge in a broad range of tasks at a level equal to or beyond a human being. AGI is currently theoretical and doesn't exist yet.

AI technologies include machine learning (where a computer system is fed large amounts of data, which it then uses to learn how to carry out a specific task), neural networks, and natural language processing.

AI has a wide range of applications, from voice assistants like Siri and Alexa, to recommendation systems used by Netflix and Amazon, to autonomous vehicles and more.

Big Four Firms

The term "Big Four" refers to the four largest international professional services networks, offering cyber security, audit, tax, consulting, advisory, corporate finance, and legal services.

The Big Four firms are:

  1. Deloitte
  2. PricewaterhouseCoopers (PwC)
  3. Ernst & Young (EY)
  4. KPMG

Chatbots

A chatbot is a software application designed to simulate human conversation. It interacts with users through messaging platforms, websites, mobile apps, or through voice command interfaces.

Chatbots can be rule-based or powered by artificial intelligence. Rule-based chatbots can only respond to specific commands, while AI-powered chatbots use machine learning and natural language processing to understand and respond to a wider range of inputs in a more conversational manner.  The two types of chatbots are described below.

  •  Rule-based chatbots: These bots follow pre-determined rules, often created through a decision tree. They can only respond to specific commands and if a user says something that isn't programmed into the bot, it won't be able to respond effectively.
  •  AI-powered chatbots: These bots use Natural Language Processing (NLP) and machine learning to understand user input, even if it's not a specific command. They can learn from past interactions to improve their responses over time.

Chatbots are used in a variety of applications, including customer service, information retrieval, and even in therapeutic contexts. They can provide 24/7 support, answer frequently asked questions, and help guide users through websites or applications.
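
A toy sketch of the rule-based approach follows; the rules and replies are invented for illustration, and a real bot would use a far richer decision tree.

```python
# A toy rule-based chatbot: it can only answer inputs that match its rules.
RULES = {
    "hello": "Hi there! How can I help you?",
    "opening hours": "We are open 9am-5pm, Monday to Friday.",
    "bye": "Goodbye!",
}

def reply(message: str) -> str:
    for keyword, response in RULES.items():
        if keyword in message.lower():
            return response
    return "Sorry, I don't understand that yet."  # fails outside its rules

print(reply("Hello!"))                        # Hi there! How can I help you?
print(reply("What are your opening hours?"))  # We are open 9am-5pm...
```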

Cognitive Bias

A cognitive bias is a systematic error in thinking that affects the decisions and judgments that people make. It's a kind of mental shortcut, often based on individual perceptions and past experiences, which can lead to distortions in how we perceive reality.

Cognitive biases can lead to perceptual blindness, inaccurate judgments, illogical interpretations, or what is broadly called irrationality. They are often a result of our brain's attempt to simplify information processing.

Some common examples of cognitive biases include:

  • Confirmation Bias: The tendency to search for, interpret, favour, and recall information in a way that confirms one's preexisting beliefs or hypotheses.
  • Hindsight Bias: Sometimes called the "knew-it-all-along" effect, the tendency to see past events as being predictable at the time those events happened.
  • Anchoring Bias: The tendency to rely too heavily on the first piece of information encountered (the "anchor") when making decisions.
  • Availability Heuristic: The tendency to overestimate the likelihood of events with greater "availability" in memory, which can be influenced by how recent the memories are or how emotionally charged they are.

Data Repositories

Data repositories are centralized places where data is stored and maintained. A repository can be a place where multiple databases or files are located for distribution over a network, or a simple place where data is deposited for safekeeping.

Data repositories are often used to store various types of data such as raw data, curated data, metadata, and relational databases. They can be physical or virtual, and can be used for data backup, archiving, and data sharing purposes.

Deep Learning

Deep learning is a subset of machine learning built on neural networks with three or more layers. These neural networks attempt to simulate the behaviour of the human brain—albeit far from matching its ability—to "learn" from large amounts of data.

While a neural network with a single layer can still make approximate predictions, additional hidden layers can help optimize and refine for accuracy.  Three layers are described below.

  •  Input Layer: This is where the network receives input from your dataset. The type and quantity of input nodes can vary depending on what data you're working with.
  •  Hidden Layers: These layers are where the neural network processes inputs using weighted connections. The weights are adjusted during training.
  •  Output Layer: This is where the network outputs a vector of values that are in a format suitable for the type of problem to be addressed.

Deep learning drives many artificial intelligence (AI) applications and services that improve automation, performing tasks such as image and speech recognition and natural language processing. Deep learning models are able to recognize patterns with extreme accuracy, given enough data.
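
The hedged NumPy sketch below traces one forward pass through a small network with an input layer, two hidden layers, and an output layer; the weights are random placeholders for what training would learn, so the outputs are meaningless until the network is trained.

```python
# A hedged sketch of a forward pass through a small deep network
# (input -> two hidden layers -> output) using NumPy.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

x = rng.normal(size=(1, 4))                                # input layer: 4 features
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 8))  # hidden-layer weights
W3 = rng.normal(size=(8, 2))                               # output layer: 2 values

h1 = relu(x @ W1)   # hidden layer 1
h2 = relu(h1 @ W2)  # hidden layer 2
output = h2 @ W3    # output layer
print(output)
```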

Encryption of Data in Motion

Encryption of data in motion, also known as data in transit, refers to the process of protecting data while it is being transferred from one location to another. This could be across the internet or through a private network.

The goal of encrypting data in motion is to ensure that the data, if intercepted during transmission, cannot be read or understood by anyone who is not the intended recipient. This is typically achieved using various encryption protocols such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), or Internet Protocol Security (IPSec). Further details follow.

  • Secure Sockets Layer (SSL): This is a standard security technology for establishing an encrypted link between a server and a client.
  • Transport Layer Security (TLS): This is an updated, more secure version of SSL. It works in much the same way as SSL, by encrypting the data that is being transmitted over the network.
  • Internet Protocol Security (IPSec): This is a set of protocols developed by the Internet Engineering Task Force (IETF) to support secure exchange of packets at the IP layer.

 Encrypting data in motion is a critical component of data security.
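
As a minimal sketch, the snippet below uses Python's standard ssl module to protect a connection with TLS; example.com is a placeholder host.

```python
# A minimal sketch of encrypting data in motion with TLS, using the
# standard-library ssl module.
import socket
import ssl

context = ssl.create_default_context()  # sensible TLS defaults
with socket.create_connection(("example.com", 443)) as sock:
    with context.wrap_socket(sock, server_hostname="example.com") as tls:
        print(tls.version())  # e.g. 'TLSv1.3'
        tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
        print(tls.recv(1024))  # readable here, but encrypted on the wire
```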

Encryption of Data at Rest

Encryption of data at rest refers to the process of protecting inactive data stored physically in any digital form, including databases, data warehouses, spreadsheets, archives, tapes, off-site backups, mobile devices, or in the cloud.

The goal of encrypting data at rest is to ensure that sensitive data is not accessible without proper authorization, even if the storage medium or device is stolen or compromised. This is typically achieved using various encryption methods such as Advanced Encryption Standard (AES), RSA, or Twofish.  Further details follow.

  •  Advanced Encryption Standard (AES): This is a symmetric encryption algorithm that is widely used across the globe.
  •  RSA: This is an asymmetric encryption algorithm used in the encryption of data and digital signatures.
  •  Twofish: This is a symmetric key block cipher with a block size of 128 bits and key sizes up to 256 bits.

 Encrypting data at rest is a critical component of data security.
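
A minimal sketch follows, assuming the third-party Python cryptography package; its Fernet recipe encrypts bytes with AES before they are written to storage.

```python
# A hedged sketch of encrypting data at rest using the third-party
# `cryptography` package (Fernet, which uses AES under the hood).
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # store this key securely, never alongside the data
fernet = Fernet(key)

ciphertext = fernet.encrypt(b"quarterly payroll figures")
# `ciphertext` is what gets written to disk, tape, or cloud storage.

plaintext = fernet.decrypt(ciphertext)
print(plaintext)  # b'quarterly payroll figures'
```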

Endowment Effect

The endowment effect, in behavioural economics, refers to an emotional bias where people value a good or service more once their property right to it has been established.

In other words, people tend to place a higher value on objects they own than objects that they do not. This is sometimes also referred to as "divestiture aversion".

The endowment effect was first theorized by Richard Thaler, who suggested that people value things more highly as soon as they own them. This can lead to decision-making that is not in the individual's best economic interest.

For example, if you buy a concert ticket for $50, but on the day of the concert, you wouldn't be willing to pay more than $30 for the same ticket, you're still likely to go to the concert because you own the ticket and therefore value it more.

This effect has significant implications in areas such as market prices and the concept of a "fair" price.

Foundation Models (FMs)

Foundation models (FMs) are deep learning models trained on vast quantities of unlabelled data, both structured and unstructured. Foundation models can be used for a wide range of tasks as is, or adapted to specific tasks with fine-tuning.

Examples of these models include GPT-4, DALL·E 2, and Stable Diffusion. Foundation models are a class of AI models that are pre-trained on a broad range of internet text and can be fine-tuned for specific tasks.

They are called "foundation" models because they serve as a base upon which a wide range of downstream models and applications can be built.

These models, such as GPT-4 by OpenAI, have demonstrated impressive performance on a variety of tasks, including translation, question-answering, and text generation, often achieving state-of-the-art results. Further details follow.

  • Pre-training: The model is trained on a large corpus of text data. This is usually a time-consuming process, but it only needs to be done once. The model learns to understand language and recognize patterns from this data.
  •  Fine-tuning: The pre-trained model is then trained on a smaller, task-specific dataset. The parameters of the model are adjusted (or "fine-tuned") to optimize its performance on the new task.

While foundation models have shown great promise, they also raise important questions and challenges related to their deployment, including issues of fairness, interpretability, robustness, and their economic and societal impacts.

Fine-tuning

Fine-tuning in AI refers to the process of taking a pre-trained model (a model that has been trained on a large-scale dataset) and adapting it to a specific task. This is done by continuing the training process on a smaller, task-specific dataset, and adjusting the model's parameters to optimize its performance on the new task.

The idea behind fine-tuning is that the pre-trained model has already learned a lot of useful, general-purpose features from the large-scale dataset, and only minor modifications are needed to adapt it to the specific task.

Fine-tuning is a common practice in many areas of AI, including computer vision and natural language processing. For example, in natural language processing, models like GPT-4 are often fine-tuned on a specific task, such as sentiment analysis or question answering, to achieve state-of-the-art performance.
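
A schematic sketch of this idea in PyTorch follows (assuming the torch package); the "pre-trained" backbone here is a randomly initialised stand-in, and only the new task-specific head is trained, at a low learning rate.

```python
# A schematic, hedged sketch of fine-tuning: freeze the backbone's
# general-purpose features and train only a small task-specific head.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())  # stand-in for a pre-trained model
head = nn.Linear(32, 2)                                 # new task-specific layer

for p in backbone.parameters():
    p.requires_grad = False  # keep the pre-trained features unchanged

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))  # toy task-specific batch
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(head(backbone(x)), y)
    loss.backward()
    optimizer.step()  # adjusts only the head's parameters
```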

Force Multiplier

A force multiplier in business refers to a factor or a combination of factors that gives personnel, a team, or a system the ability to accomplish greater feats than without it.

It's a tool or capability that significantly increases the potential output or effectiveness, such as AI which automates certain repetitive tasks.

In the context of business, a force multiplier could be a piece of technology, a new process, or a strategic partnership that dramatically increases productivity or growth.

Generative AI

Generative AI is AI that is typically built using foundation models and has the capabilities that earlier AI did not have, such as the ability to generate content. 

Foundation models can also be used for nongenerative purposes, such as classifying user sentiment as negative or positive based on call transcripts. Generative AI offers significant improvements over earlier model use cases.

Generative AI is a subset of artificial intelligence that focuses on creating new content. It's a type of machine learning that allows computers to generate data that resembles the data it was trained on.

This can include a wide range of outputs, such as text, images, music, and even voice. For example, a generative AI model could be trained on a dataset of paintings and then generate a new painting that resembles those it was trained on. The process is as follows:

  • Training: The model is trained on a large dataset of a specific type of content (e.g., text, images, music).
  • Generation: Once trained, the model can generate new content that resembles the training data.
  • Applications: Generative AI has a wide range of applications, including creating art, writing text, synthesizing music, designing products, and more.

Generative Adversarial Networks (GANs)

One of the most well-known types of generative AI models is Generative Adversarial Networks (GANs).  These comprise two parts, as follows:

  •  Generator: creates the new content
  •  Discriminator: tries to distinguish between the generated content and the real training data.

The two parts work together to improve the quality of the generated content. Other examples of generative models include Variational Autoencoders (VAEs) and Transformer models like GPT-3 for text generation.
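
A miniature, hedged PyTorch sketch of this adversarial loop follows; the "real" data is a toy one-dimensional distribution, chosen so the whole example stays self-contained.

```python
# A toy GAN: the generator learns to mimic samples drawn around a mean of 4;
# the discriminator tries to tell real samples from generated ones.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for _ in range(500):
    real = torch.randn(32, 1) * 1.25 + 4.0  # "real" training data, mean 4
    fake = G(torch.randn(32, 8))            # generator's attempt

    # Discriminator step: label real as 1, generated as 0.
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator step: try to make the discriminator output 1 for fakes.
    loss_g = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

print(G(torch.randn(256, 8)).mean())  # drifts toward the real mean of 4
```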

GPT

GPT in AI stands for Generative Pretrained Transformer.

It's a type of artificial intelligence model developed by OpenAI for natural language processing tasks, such as translation, question answering, and text generation.

The "generative" part refers to the model's ability to generate creative outputs, such as writing a story or an essay.

"Pretrained" means that the model has been previously trained on a large amount of text data, allowing it to generate coherent and contextually relevant sentences.

The "transformer" part refers to the model's architecture, which uses a mechanism called attention to weigh the influence of different words when generating an output.

Graphics Processing Unit

A graphics processing unit (GPU) is a type of processor designed to handle tasks related to rendering graphics, particularly for gaming, 3D modelling, and video editing.

However, in recent years GPUs have also become popular in general-purpose computing for their ability to perform parallel operations on large blocks of data, making them ideal for machine learning, deep learning, and other data-intensive tasks. Further details follow.

  • Parallel Processing: Unlike CPUs (Central Processing Units) that are designed to handle a few complex tasks at a time, GPUs are designed to handle hundreds or thousands of simple tasks simultaneously. This makes them particularly good at performing operations on large blocks of data at once.
  • High Bandwidth Memory: GPUs typically have access to high-speed memory technologies, which allow them to quickly read and write large amounts of data.
  • Specialized Hardware: GPUs have specialized hardware for certain types of calculations, such as those used in graphics rendering and machine learning. This can make them more efficient than CPUs for these specific tasks. 

In the context of AI and machine learning, GPUs can significantly speed up the training process for neural networks, as these tasks involve a lot of matrix and vector operations, which can be parallelized effectively on a GPU.

Groupthink

Groupthink is a psychological phenomenon that occurs within a group of people, in which the desire for harmony or conformity in the group results in an irrational or dysfunctional decision-making outcome.

Group members try to minimize conflict and reach a consensus decision without critical evaluation of alternative viewpoints, by actively suppressing dissenting viewpoints, and by isolating themselves from outside influences.

It was first researched by Irving Janis in the 1970s, who explained that groupthink occurs when a group makes faulty decisions because group pressures lead to a deterioration of “mental efficiency, reality testing, and moral judgment”.

Groupthink can lead to poor decision making since it discourages creativity and individual responsibility. It is a common issue in team settings, especially in situations where team cohesiveness is high and there is a lack of open communication.

Guardrails

Guardrails in Generative AI are a set of constraints or rules that are put in place to guide the AI system in generating outputs. They are used to ensure that the AI system operates within certain boundaries and does not produce undesirable or inappropriate results.

For example, in a text generation AI, guardrails could be set to prevent the system from generating text that includes profanity, hate speech, or sensitive information. They can also be used to guide the AI towards generating more creative or diverse outputs.

Guardrails are an important part of responsible AI usage, as they help to ensure that AI systems are used in a way that is ethical, fair, and respects user privacy.

They are a key tool in managing the risks associated with AI and are increasingly being used in a wide range of AI applications, from chatbots to content generation.
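
As a toy illustration only, the sketch below applies one very simple guardrail, a post-generation filter over banned terms; production systems use far more sophisticated checks than string matching.

```python
# A toy guardrail: a post-generation filter that blocks outputs containing
# banned terms. The term list and message are invented for illustration.
BANNED = {"password", "ssn"}

def apply_guardrail(text: str) -> str:
    if any(term in text.lower() for term in BANNED):
        return "[Response withheld: violates content policy]"
    return text

print(apply_guardrail("Your SSN is 123-45-6789"))  # blocked by the guardrail
print(apply_guardrail("Our opening hours are 9-5"))  # passes through unchanged
```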

Hallucinations

In the context of AI, particularly in generative models, "hallucinations" refer to instances where the AI generates or outputs information that wasn't in the input data. This can happen when the AI model makes assumptions or fills in gaps based on its training data.

For example, in image generation, an AI might "hallucinate" details that weren't in the original image, such as adding extra objects or altering colours. In text generation, an AI might generate details or facts that weren't mentioned or implied in the input text.

These hallucinations can sometimes lead to creative and unexpected results, but they can also lead to inaccuracies and mistakes. It's one of the challenges in developing and working with generative AI models.

It's important to note that these hallucinations are not conscious or intentional on the part of the AI, but rather a result of the statistical patterns it has learned during its training process.

Ideation

Ideation is the creative process of generating, developing, and communicating new ideas.

It involves several stages, from the initial generation of ideas to their development and realization.

In a business context, ideation can be used to generate new product ideas, improve existing services, or devise solutions to problems. It often involves brainstorming sessions, collaboration, and iterative feedback.

Integrations

In the context of information technology (IT), integrations refer to the process of combining different computing systems and software applications physically or functionally, to act as a coordinated whole.

The goal of integration is to create a seamless flow of data between various IT systems and software applications.

Integrations can be done in several ways, including through APIs (Application Programming Interfaces), middleware, or even manually (though this is less common due to the high potential for error and inefficiency).  Further details follow.

  • APIs (Application Programming Interfaces): These allow different software programs to communicate with each other, facilitating the integration.
  •  Middleware: This is software that acts as a bridge between an operating system or database and applications, especially on a network.
  •  Manual Integration: This is less common due to the high potential for error and inefficiencies.

ISO27001

ISO 27001 is an international standard on how to manage information security.

The standard was originally published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) in 2005 and then revised in 2013.

ISO 27001 provides a framework for establishing, implementing, maintaining, and continually improving an Information Security Management System (ISMS).

An ISMS is a systematic approach to managing sensitive company information so that it remains secure. It includes people, processes, and IT systems by applying a risk management process and gives assurance to company stakeholders that risk is being managed effectively.

Key sections of ISO 27001 include:

  • Risk Assessment: The company must define a security risk assessment approach and conduct the assessment.
  • Risk Treatment: The company must create a plan to manage and mitigate the risks identified in the risk assessment.
  • Objectives and Policies: The company must set information security objectives and establish policies to achieve these objectives.
  • Organization of Information Security: The company must define roles and responsibilities for information security and establish a framework for information security management.
  • Human Resource Security: The company must implement processes to ensure that employees and contractors understand their responsibilities and are suitable for the roles they are considered for.
  • Asset Management: The company must identify information assets and define appropriate protection responsibilities.
  • Access Control: The company must manage user access to information and systems.
  • Operations Security: The company must secure its operations management, system planning, protection against malware, backup, logging, and monitoring.
  • Communications Security: The company must secure its information in networks and secure its information transfer.
  • System Acquisition, Development, and Maintenance: The company must ensure that information security is a key part of its systems throughout their lifecycle.
  • Supplier Relationships: The company must protect its assets that are accessible to suppliers.
  • Information Security Incident Management: The company must establish a management process for information security events and weaknesses.
  • Information Security Aspects of Business Continuity Management: The company must address information security in its business continuity management.
  • Compliance: The company must identify applicable laws, regulations, and contractual requirements, and ensure compliance with these and with its own policies.

Companies that meet the requirements of the standard can be certified by an accredited certification body following successful completion of an audit.

Knowledge Worker Productivity

Knowledge worker productivity refers to the efficiency and effectiveness with which knowledge workers, such as analysts, managers, software developers, or consultants, can produce valuable outputs from their work.

Knowledge workers primarily deal with information and knowledge, rather than physical tasks or manual labour. Their productivity is often harder to measure than that of manual workers, as the outputs of their work are often intangible, and their tasks are less repetitive and more complex.  Examples follow.

  • Efficiency: This refers to the amount of output a knowledge worker can produce in a given amount of time. For example, how many reports an analyst can produce in a week, or how many lines of code a software developer can write in a day.
  • Effectiveness: This refers to the quality of the output a knowledge worker produces. For example, the accuracy of an analyst's predictions, or the functionality of a software developer's code.

Improving knowledge worker productivity can involve a variety of strategies, such as providing better tools and technologies, improving work processes, providing training and development opportunities, and creating a supportive work environment that encourages creativity and problem-solving.

Large Language Models (LLMs)

Large language models (LLMs) make up a class of foundation models that can process massive amounts of unstructured text and learn the relationships between words or portions of words, known as tokens. 

This enables LLMs to generate natural-language text, performing tasks such as summarisation or knowledge extraction. GPT-4 is an example of an LLM.

Large Language Models are a type of artificial intelligence model that have been trained on a vast amount of text data. They are designed to generate human-like text based on the input they are given.

These models, such as GPT-4 by OpenAI, have hundreds of billions of parameters and are trained on diverse Internet text.

However, they do not know specifics about which documents were in their training set or have access to any personal or confidential information unless it has been shared with them in the course of the conversation.

Large Language Models can generate creative writing, answer questions, translate languages, and even write software code, among other tasks. They analyze the input they're given and generate output based on patterns and structures they've learned during training.

Despite their capabilities, these models have limitations. They do not understand text in the way humans do and can sometimes write incorrect or nonsensical responses.

They are sensitive to the input they're given and can sometimes generate inappropriate or biased content.

They also lack the ability to provide personal experiences or opinions, as they do not have access to personal data unless explicitly provided during the conversation.

Machine Learning

Machine Learning (ML) is a subset of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves.

The process of learning begins with observations or data, such as examples, direct experience, or instruction, to look for patterns in data and make better decisions in the future based on the examples that we provide.

The primary aim is to allow computers to learn automatically, without human intervention or assistance, and adjust actions accordingly. Three examples follow.

  • Supervised Learning: The model is provided with labelled training data, and the goal is to learn a mapping from inputs to outputs.
  • Unsupervised Learning: The model is given unlabelled training data and must find structure in the inputs on its own.
  • Reinforcement Learning: The model learns by interacting with an environment, receiving rewards or penalties for different actions, and adjusting its behaviour to maximize the rewards.

Machine learning is used in a range of computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible; for example, in applications including email filtering, detection of network intruders, and computer vision.

Measuring Knowledge Worker Productivity

Measuring the productivity of knowledge workers can be challenging due to the intangible and often complex nature of their work.

However, there are several methods that can be used to assess their productivity, as follows:

  1. Output Quality: Evaluate the quality of the work produced. This could be the accuracy of a report, the effectiveness of a strategy, or the usability of a software product.
  2. Output Quantity: Count the number of tasks completed or projects delivered within a certain timeframe. However, it's important to balance this with quality considerations.
  3. Project Goals: Assess how effectively the worker meets or exceeds the goals set for specific projects.
  4. Innovation: Evaluate the worker's contribution to innovation, such as new ideas, processes, or products.
  5. Peer and Manager Reviews: Use feedback from colleagues and supervisors to assess a worker's performance.
  6. Customer Satisfaction: In roles where the worker interacts with customers or clients, their satisfaction can be a key indicator of the worker's productivity.
  7. Learning and Growth: Assess the worker's development of new skills and knowledge.

It is important to recognise that the best approach often involves a combination of these methods, and the specific metrics used can vary depending on the role and the organization. 

It's also important to ensure that the methods used to measure productivity align with the organization's overall goals and values.

Modality

In the context of information technology (IT) and Artificial Intelligence (AI), modality refers to the way information is represented or the type of data that is being processed.

Different modalities include text, images, audio, video, and more. Each of these modalities requires different techniques for processing and analysis.

For example, text data might be processed using natural language processing (NLP) techniques, while image data might be processed using computer vision techniques.

In AI, multi-modal systems are those that can process and integrate information from multiple different modalities.

For example, a multi-modal AI system might be able to understand a video by processing both the visual data (the images) and the audio data (the soundtrack).

Examples follow:

  • Text: This modality includes any data that is in written or textual form. Techniques like Natural Language Processing (NLP) are used to process and analyze this data.
  • Images: This includes any data in the form of pictures or visuals. Computer vision techniques are used to process this data.
  • Audio: This includes any data in the form of sound. Techniques like speech recognition or audio signal processing are used to analyze this data.
  • Video: This is a multi-modal type of data that includes both visual and audio information. Techniques from both computer vision and audio signal processing can be used to analyze this data.

Natural Language Processing (NLP)

Natural language processing (NLP) is a subfield of artificial intelligence that focuses on the interaction between computers and humans through natural language.

The ultimate objective of NLP is to read, decipher, understand, and make sense of the human language in a valuable way.

NLP involves several tasks and techniques, including but not limited to:

  • Text Analysis: Extracting and analyzing information from text, such as keywords, phrases, or sentiments.
  • Speech Recognition: Converting spoken language into written form.
  • Natural Language Understanding: Understanding the meaning of text, including the roles and relationships of the words, and interpreting the text accordingly.
  • Natural Language Generation: Generating text that is readable, stylistically natural, and grammatically correct.

NLP is used in a variety of applications, including language translation, sentiment analysis, chatbots, voice assistants (like Siri or Alexa), and more. It's a key technology for enabling human-computer interaction in a natural, intuitive way.
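
As one tiny corner of text analysis, the sketch below extracts frequent content words from a passage using only Python's standard library; the stop-word list is deliberately minimal and for illustration only.

```python
# A toy text-analysis sketch: extract the most frequent content words.
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "it"}

text = "The model reads the text, and the model extracts the key terms."
words = re.findall(r"[a-z']+", text.lower())
keywords = Counter(w for w in words if w not in STOPWORDS)

print(keywords.most_common(3))  # [('model', 2), ...]
```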

Neural Network

A neural network is a computing model inspired by the way biological brains work. It's designed to simulate the behaviour of interconnected brain cells (neurons) in order to solve complex tasks.

Neural networks are a key part of artificial intelligence (AI) and are used for tasks that require pattern recognition, decision-making, and learning from experience. They're particularly effective for processing unstructured data, such as images and natural language text.

Here's a simple breakdown of a neural network:

  • Input Layer: This is where the network receives input from your dataset. The type and quantity of input nodes can vary depending on what data you're working with.
  • Hidden Layers: These layers are where the neural network processes inputs using weighted connections. The weights are adjusted during training.
  • Output Layer: This is where the network outputs a vector of values that are in a format suitable for the type of problem to be addressed.

Each node in the hidden layers represents a neuron and contains an activation function that determines whether and to what extent that signal should progress further through the network.

The process of adjusting the weights based on the output of the activation functions is known as training the neural network.
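
A hedged sketch of this training loop at the smallest possible scale follows: a single sigmoid neuron whose weights are adjusted by gradient descent on a toy, OR-like dataset.

```python
# "Training" at its smallest scale: one neuron with a sigmoid activation,
# its weights adjusted by gradient descent on toy data.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([1.0, 1.0, 1.0, 0.0])  # a learnable OR-like pattern
w, b = np.zeros(2), 0.0

for _ in range(1000):
    pred = sigmoid(X @ w + b)          # forward pass through the neuron
    error = pred - y
    w -= 0.5 * (X.T @ error) / len(y)  # adjust weights based on the error
    b -= 0.5 * error.mean()

print(np.round(sigmoid(X @ w + b), 2))  # close to [1, 1, 1, 0]
```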

Neural networks are used in a variety of applications, including image and speech recognition, natural language processing, and recommendation systems.

Penetration Testing

Penetration testing, also known as pen testing or ethical hacking, is a practice in software security where a system, network, or web application is intentionally attacked to identify potential security vulnerabilities that could be exploited by hackers.

The goal of penetration testing is to uncover weak spots in an organization's security posture before attackers do. It can involve the use of automated tools or manual techniques.

The main objectives of penetration testing include:

  • Identifying weak spots in an organization's security posture.
  • Validating the effectiveness of defensive mechanisms.
  • Testing the organization's compliance with security policy.
  • Testing employees' security awareness.

Productivity

Productivity refers to the efficiency of a person, machine, factory, system, etc., in converting inputs into useful outputs. It is generally measured by the rate at which products are generated by a system running at full capacity.

In the context of business, productivity is a measure of how effectively resources (like labour, capital, and materials) are used to produce goods and services. A higher productivity means that more goods and services are produced with the same amount of resources.

Further details follow.

  • Labor Productivity: This is a measure of the amount of goods and services that a worker produces in a given amount of time. It's often used to compare the productivity of different countries or regions.
  • Capital Productivity: This is a measure of how effectively a business uses its capital to generate revenue.
  • Total Factor Productivity: This is a measure of how efficiently all inputs (labour, capital, materials, etc.) are used in production.

Improving productivity is a major goal in many organizations, as it can lead to increased profitability. This can be achieved through various means, such as implementing new technologies, improving processes, or enhancing worker skills.

Prompt Engineering

Prompt engineering refers to the process of designing, refining, and optimising input prompts to guide a generative AI model toward producing desired and accurate outputs.

Prompt engineering is a technique used in the field of artificial intelligence, particularly with language models, to craft effective prompts that guide the model to produce the desired output.

A "prompt" is the input given to a language model to which it responds. For example, if you're using a language model to write an email, the prompt might be the beginning of the email, and the model's task is to complete it.

Prompt engineering involves designing and optimizing these prompts to get the most useful and accurate responses from the model. This can involve specifying the format of the desired answer, providing examples of correct answers, or asking the model to think step-by-step or debate pros and cons before settling on an answer.

Examples follow.

  • Prompt Design: Crafting the initial input or question to guide the model's response.
  • Format Specification: Including instructions about the format of the desired answer in the prompt.
  • Example Provision: Providing examples of correct answers in the prompt.
  • Step-by-Step Thinking: Asking the model to think step-by-step or debate pros and cons.

Prompt engineering is an important skill when working with large language models, as the design of the prompt can significantly influence the quality of the model's output.
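
A hedged sketch of a prompt template combining these techniques follows; the persona, format, and example are invented, and the model call itself is omitted so the snippet stays self-contained.

```python
# A sketch of prompt construction: format specification, an example answer,
# and step-by-step instructions, assembled into one prompt string.
PROMPT_TEMPLATE = """You are a careful financial analyst.

Question: {question}

Think step by step, then answer in exactly this format:
Answer: <one sentence>
Confidence: <low | medium | high>

Example:
Answer: Revenue grew because unit sales rose 12% year on year.
Confidence: high
"""

prompt = PROMPT_TEMPLATE.format(question="Why did Q3 margins fall?")
print(prompt)  # this string would be sent to the language model
```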

Return on Investment (ROI)

Return on investment (ROI) is a financial metric that is widely used to measure the profitability of an investment. It is a ratio that compares the gain or loss from an investment relative to its cost.

The formula to calculate ROI follows:

ROI = (Net Profit / Cost of Investment) * 100%

  • Net Profit: This is the gain from the investment minus the cost of the investment.
  • Cost of Investment: This is the total out-of-pocket costs for the investment.

The ROI is expressed as a percentage. If it's positive, it means the investment gained value; if it's negative, the investment lost value.

For example, if you invest $1,000 in a project and earn $1,200 in return, your ROI would be 20%.

A higher ROI indicates a greater return on investment.
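
The same calculation, expressed as a small Python function mirroring the example above:

```python
# ROI = (net profit / cost of investment) * 100%
def roi(net_profit: float, cost: float) -> float:
    return (net_profit / cost) * 100

gain, cost = 1200, 1000
print(roi(gain - cost, cost))  # 20.0 (%)
```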

Self-attention

Self-attention, also known as intra-attention, is a mechanism used in artificial intelligence models, particularly in natural language processing (NLP), that helps the model to focus on different parts of the input when producing an output.

It's a key component of Transformer models, which are widely used in NLP tasks.

In the context of NLP, self-attention allows a model to weigh the importance of words in an input sequence when generating an output sequence.

For example, when translating a sentence from one language to another, the model uses self-attention to determine which words in the input sentence are most relevant to each word in the output sentence.  The process is outlined as follows:

  1. Input: The model receives a sequence of words (or more generally, tokens) as input.
  2. Self-Attention Calculation: For each word in the input sequence, the model calculates a set of attention scores with respect to all the other words in the sequence. These scores determine how much each word in the sequence should contribute to the understanding of the current word.
  3. Output: The attention scores are used to create a weighted combination of the input words, which is used in generating the output sequence.

Self-attention allows the model to capture dependencies between words regardless of their distance in the sentence, making it very effective for many NLP tasks. It's one of the key innovations behind models like GPT-4.
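
The hedged NumPy sketch below computes scaled dot-product self-attention for a toy sequence of three tokens; the projection matrices are random stand-ins for learned parameters.

```python
# Scaled dot-product self-attention for 3 tokens with 4-dimensional embeddings.
import numpy as np

rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))  # one embedding per input token (step 1)
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))

Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv  # queries, keys, values
scores = Q @ K.T / np.sqrt(K.shape[-1])          # attention scores (step 2)

weights = np.exp(scores)                          # softmax over each row
weights /= weights.sum(axis=-1, keepdims=True)

output = weights @ V  # weighted combination of the inputs (step 3)
print(output.shape)   # (3, 4)
```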

Status Quo Bias

Status quo bias is a cognitive bias that refers to the preference for the current state of affairs.

In other words, people are generally inclined to resist change and prefer to maintain things as they are.  Examples follow:

  1. Resistance to Change: People with status quo bias tend to prefer the familiar and are often resistant to change, even when the change may be beneficial.
  2. Decision Making: This bias can impact decision making, with individuals favouring options that maintain the current situation over alternatives that could lead to any change.
  3. Risk Aversion: Status quo bias is often linked to risk aversion, where the potential losses from changing the status quo are perceived to be greater than the potential gains.
  4. Inertia: It can also be seen as a form of inertia, where the easiest course of action is to simply do nothing and maintain the current situation.

In business and economics, understanding status quo bias can be important as it can influence consumer behaviour, investment decisions, and policy making.

SOC 2 Type 2

SOC 2 Type 2 is a type of audit report that focuses on a service organization's non-financial reporting controls as they relate to security, availability, processing integrity, confidentiality, and privacy of a system.

The SOC 2 Type 2 report is issued by an independent auditing firm and is part of the American Institute of CPAs (AICPA)'s Service Organization Control reporting platform.

The key difference between a SOC 2 Type 1 and SOC 2 Type 2 report is that a Type 1 report is concerned with the suitability of the design of controls at a specific point in time, whereas a Type 2 report also includes the operating effectiveness of those controls over a specified review period, typically 6 months to a year.

The SOC 2 Type 2 report includes:

  • Management's Description of the Service Organization's System: This is a written assertion from management about the fairness of the presentation of the system's design and implementation and the suitability of the design of the controls to meet the applicable trust services criteria.
  • Auditor's Opinion: The auditor's opinion on the fairness of the presentation of the management's description of the service organization's system, the suitability of the design of the controls to meet the applicable trust services criteria, and the operating effectiveness of the controls to meet the applicable trust services criteria.
  • System Overview: A description of the services provided by the service organization, including the types of customers that use the service, the nature of the business operations, the nature and type of the information processed, and the relevant control objectives.
  • Tests of Controls and Results: A detailed description of the auditor's tests of controls and the results of those tests.
  • Other Information Provided by the Service Organization: This may include additional details about the service organization's controls, such as a summary of the service auditor's testing of controls.

SOC 2 Type 2 reports are typically requested by stakeholders (e.g., customers, regulators, business partners, suppliers) of the service organization who need assurance about the controls at the organization that affect the security, availability, and processing integrity of the systems the service organization uses to process users' data, and the confidentiality and privacy of the information processed by these systems.

Structured Data

Structured data is tabular data, such as data organised in tables, databases, or spreadsheets. This data can be used to train some machine learning models effectively.

Structured data refers to any data that resides in a fixed field within a record or file. This includes data contained in relational databases and spreadsheets, where methods of structuring the data are applied.

The key aspect of structured data is that it's organized in a manner that's easily understandable by machines. The structure is rigid, and the data usually follows a specific schema, meaning it's organized in a predefined manner with a set pattern. Examples follow:

  • Relational Databases: These are the most common example of structured data. They organize data into tables, rows, and columns. Each column in a table represents a certain attribute, and each row represents a single record.
  • Spreadsheets: Like databases, spreadsheets also organize data into rows and columns, making it another good example of structured data.
  • Data Types: Structured data is defined by data types, such as integers, boolean, decimal, date/time, etc.
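
A small sketch of structured data in code follows, assuming the pandas package; the table contents are invented for illustration.

```python
# Structured data: rows and columns with fixed, typed fields,
# here held in a pandas DataFrame.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1001, 1002, 1003],          # integer
    "customer": ["Acme", "Globex", "Acme"],  # string
    "amount":   [250.00, 99.50, 410.75],     # decimal
    "shipped":  [True, False, True],         # boolean
})

print(orders.dtypes)                    # every column has a declared type
print(orders[orders["amount"] > 100])   # the schema makes querying simple
```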

Structured data is the opposite of unstructured data, which is data that doesn't have a predefined schema or isn't organized in a predefined manner, such as emails, social media posts, and word processing documents.

 

“Under the Hood”

"Under the hood" in the context of prompt engineering in AI refers to the inner workings or mechanisms of the AI model that are not immediately visible or apparent to the user.

It's a metaphor derived from car terminology, where "under the hood" refers to the engine and other mechanical parts located under the car's hood.

In prompt engineering, "under the hood" could refer to:

  1. Model Architecture: The structure of the AI model, including the type and arrangement of its layers, the activation functions it uses, and other design choices.
  2. Training Process: The methods and techniques used to train the AI model, including the optimization algorithm, the loss function, and the training data.
  3. Feature Extraction: The process of transforming raw data into a format that the AI model can understand and learn from.
  4. Model Parameters: The weights and biases that the AI model learns during training, which determine how it makes predictions or decisions.

Understanding what's happening "under the hood" can help in designing effective prompts, interpreting the model's outputs, and troubleshooting any issues.

Tokens

Tokens in Generative AI refer to the smallest units of data that the model can understand and generate.

In the context of text generation, a token can be as small as a character or as large as a word or even a sentence, depending on how the model is designed.

For example, in OpenAI's GPT-4, a token is equivalent to a chunk of text, which can be as short as one character or as long as one word (e.g., 'a', 'be', 'car', 'hello'). The model reads in these tokens one at a time and uses them to predict what comes next.

Tokens are crucial in Generative AI because they determine the granularity of the model's understanding and generation capabilities. A model that operates on word-level tokens might generate more coherent text but at the cost of missing out on finer details that a character-level model could capture.
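
A brief sketch follows, assuming the third-party tiktoken package, which implements the tokenisers used by OpenAI models.

```python
# Splitting text into tokens and mapping them back, using tiktoken.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Hello, tokens!")

print(tokens)               # a list of integer token IDs
print(len(tokens))          # short common words are often a single token
print(enc.decode(tokens))   # round-trips back to the original text
```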

Training an AI Model

Training an AI model refers to the process of learning from data.

It's a phase in the development of an AI system where the model learns to make predictions or decisions based on input data.  The process follows.

  1. Data Collection: The first step is to collect relevant data that the model can learn from. This could be images, text, audio, or numerical data, depending on the task at hand.
  2. Preprocessing: The collected data is then cleaned and transformed into a format that the AI model can understand.
  3. Model Training: The pre-processed data is fed into the AI model. The model tries to find patterns in the data that can be used to make predictions or decisions. This is done by adjusting the model's internal parameters.
  4. Evaluation: The model's performance is evaluated using a separate set of data (test data). If the performance is not satisfactory, the model may be adjusted, and the training process repeated.
  5. Deployment: Once the model is trained and evaluated, it can be deployed to perform the task it was designed for, such as recognizing images, understanding speech, or predicting stock prices.

The goal of training an AI model is to create a system that can make accurate predictions or decisions based on new, unseen data.
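
A hedged end-to-end sketch of steps 1 to 4 follows, using scikit-learn with a synthetic dataset in place of real collected data.

```python
# Collect (synthesise) data, split off test data, train, then evaluate.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=200, random_state=0)  # step 1: data
X_train, X_test, y_train, y_test = train_test_split(       # hold out test data
    X, y, test_size=0.25, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)     # step 3: training
accuracy = accuracy_score(y_test, model.predict(X_test))   # step 4: evaluation
print(f"Test accuracy: {accuracy:.2f}")
```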

Transformers

Transformers are a relatively new neural network architecture that relies on self-attention mechanisms to transform a sequence of inputs into a sequence of outputs while focusing its attention on important parts of the context around the inputs.

Transformers are a type of model architecture used in the field of artificial intelligence, specifically in natural language processing (NLP). They were introduced in a paper titled "Attention is All You Need" by Vaswani et al., from Google Brain, in 2017.

The key innovation of Transformers is the self-attention mechanism, which allows the model to weigh the importance of words in an input sequence when generating an output sequence.

This is particularly useful in tasks like machine translation, where the meaning of a word can depend on its context in a sentence.

Transformers have been the basis for many state-of-the-art models in NLP, such as BERT (Bidirectional Encoder Representations from Transformers), GPT (Generative Pretrained Transformer), and others.

Use Cases

Use cases are targeted applications of a technology to a specific business challenge that produce one or more measurable outcomes.

For example, in marketing, generative AI could be used to generate creative content such as personalised emails.

Unstructured Data

Unstructured data lacks a consistent format or structure. 

Unstructured data includes, for example, text, images and audio files.

Typically, unstructured data requires more advanced techniques to extract insights.

Unstructured data refers to information that either does not have a pre-defined data model or is not organized in a pre-defined manner. This type of data is typically text-heavy, but may contain data such as dates, numbers, and facts as well.

Examples of unstructured data include:

  • Text files: Word processing, spreadsheets, presentations, email, logs.
  • Email: Email has some internal structure thanks to its metadata but is largely unstructured in nature.
  • Social Media: Data from Facebook, Twitter, LinkedIn, and other social media platforms.
  • Website content: YouTube videos, Instagram photos, audio files, web pages, PDFs, etc.
  • Mobile data: Text messages, locations.
  • Communications: Chat, IM apps, phone recordings.
  • Media: MP3 files, video files, images.

Unstructured data can be stored in a variety of ways, including in data lakes or NoSQL databases.

Despite its lack of structure, this data can be extremely valuable if analyzed properly, with the help of advanced technologies like natural language processing, text analytics, data mining, and machine learning.

Walled-garden

A walled-garden in the context of AI refers to a closed system where the operations and processes are controlled by the system's owner or operator.

This term is often used to describe AI systems that operate in a restricted environment and do not allow for third-party integrations or access to their underlying data or algorithms.

Here's a simple breakdown:

  1. Controlled Environment: In a walled-garden, the system's owner has complete control over the data, algorithms, and operations. This can ensure consistency and quality, but it can also limit innovation and flexibility.
  2. Limited Access: Third parties typically cannot access or modify the data or algorithms within a walled-garden. This can protect the system's integrity and security, but it can also limit transparency and interoperability.
  3. Proprietary Systems: Many walled-gardens are proprietary systems, meaning they are owned and operated by a single entity. This can lead to monopolistic practices and limit competition.

In the context of AI, a walled-garden can be beneficial for maintaining control over the quality and security of an AI system. However, it can also limit the system's ability to learn from diverse data sources and integrate with other systems.
