Smart system cost

We will create a somewhat concrete, yet still hypothetical example.

We’ll illustrate the total cost over a year (365 days) of operation for the following scenario:

The system is a free-roaming, four-legged front-loader robot.
It has two arms to handle parcels and packages. The arms have sensors for weight and surface characteristics – a “gentle touch”, so as not to crush materials or injure human operators.

The robot operates inside a facility, warehouse or factory. It has a visual, lidar and radar system for observing and navigating the environment. That is subsystem number 1, “Navigation”.

The front-loader has a system to handle propulsion, the four legs. That is subsystem 2, “Propulsion”.

The front-loader has a system to handle the materials in the warehouse with the two arms. This is subsystem 3, “Payload processing”.

The front-loader has a separate visual, audio and textual UI system for interacting with human workers in the facility or elsewhere (remote connection). The front-loader’s UX is friendly and based on state-of-the-art Human Computer Interface practices. This is subsystem 4, “Human interaction”.

The front-loader is retrained every 24 hours, i.e. 365 times per year. The initial training material for subsystems 1–4 is almost completely disjoint, and each separate training set has a high signal-to-noise ratio. The system is expected to handle 20 human interactions and 10 parcel operations per hour.

With these characteristics, we will compare putting all the ML eggs in one model vs. four disjoint models.

Architecture alternatives

A. Monolithic model
  • One large multimodal model handling all subsystems jointly.
  • Shared latent space across navigation, control, manipulation, and HCI.
  • Retrained end-to-end every 24 hours.
B. Multimodel system
  • Four specialist models:
    • M1: Navigation
    • M2: Propulsion
    • M3: Payload processing
    • M4: Human interaction
  • Lightweight integration layer:
    • Task router
    • Shared state abstraction
  • Each subsystem retrained independently every 24 hours.

Because the training data is almost completely disjoint and each subset has a high signal-to-noise ratio, this is a best-case scenario for modularization.

Parameter and scaling assumptions

These are deliberately conservative and internally consistent.

Model sizes

Let:

  • Monolithic model size: P_mono = 10^10 parameters
  • Each specialist (thanks to disjoint, high-SR data): P_i = 1.5 × 10^9

Total specialist parameters: ΣP_i = 6 × 10^9

Modular storage is smaller, not larger. That is realistic in this case, since the domains barely overlap.


Training cost scaling

We’ll assume that training cost is proportional to the dataset size T and the number of parameters P in a model: C_train ∝ T · P

Let one full training of the monolith cost: C_train,mono = 1.0 cost unit

Then per full retraining: C_train,multi = 0.15 per subsystem ⇒ 0.6 total per day

This reflects:

  • smaller models,
  • higher SR,
  • no cross-domain entanglement.

Inference activity volume and cost

Per robot:

  • Human interactions:
    20 × 24 × 365 = 175,200 / year
  • Parcel ops:
    10 × 24 × 365 = 87,600 / year

Assume each event requires:

  • Monolith: full model inference
  • Modular: 1–2 specialists activated, average = 1.5

Assume inference cost ∝ active parameters.

  • Monolith inference cost per event: C_inf,mono ∝ 10^10
  • Modular inference cost per event: C_inf,multi ∝ 1.5 × 1.5 × 10^9 = 2.25 × 10^9

That is ~4.4× cheaper per interaction for the smart system based on multiple integrated models – a kind of “Docker for AI”.

Almost there: Five-year Total Cost

Training cost (5 years)

| Architecture | Daily cost | Days | 5-year total |
|---|---|---|---|
| Monolithic | 1.0 | 1825 | 1,825 |
| Multimodel | 0.6 | 1825 | 1,095 |

Training savings: ~40%


Inference cost (5 years)

Total interactions per year: 175,200 + 87,600 = 262,800

Five years: 5 × 262,800 = 1,314,000 ≈ 1.314 × 10^6 events

| Architecture | Cost per event | 5-year total |
|---|---|---|
| Monolithic | 1.0 | 1,314,000 |
| Multimodel | 0.225 | 295,650 |

Inference savings: ~4.4×


Storage & integration (5 years)
| Component | Monolithic | Multimodel |
|---|---|---|
| Model storage | High (10B params) | Moderate (6B params) |
| Integration infra | Minimal | Moderate |
| Net effect | Baseline | +5–10% overhead |

We will conservatively add 100 cost units to the multimodel TCO.

Final TCO comparison (5 years)

| Cost component | Monolithic | Multimodel |
|---|---|---|
| Training | 1,825 | 1,095 |
| Inference | 1,314,000 | 295,650 |
| Storage + integration | ~0 | +100 |
| Total TCO | ~1,315,825 | ~296,845 |
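The totals in the tables can be reproduced with a short script using only the assumptions stated above (cost units, event rates, parameter counts):

```python
# Reproduces the 5-year TCO tables above. All numbers follow the
# stated assumptions: one full monolith training = 1.0 cost unit,
# one monolith inference = 1.0 cost unit.

DAYS = 5 * 365                                   # 1825 retraining days

# Training: four specialists at 0.15 each vs. one monolith at 1.0/day
mono_train = 1.0 * DAYS                          # 1,825
multi_train = 4 * 0.15 * DAYS                    # 0.6/day -> 1,095

# Inference: 20 interactions + 10 parcel ops per hour, around the clock
events_per_year = (20 + 10) * 24 * 365           # 262,800
events = 5 * events_per_year                     # 1,314,000
mono_inf = 1.0 * events
# 1.5 specialists active on average, each 1.5e9 of the monolith's 1e10
multi_inf = (1.5 * 1.5e9 / 1e10) * events        # 0.225/event -> 295,650

mono_tco = mono_train + mono_inf                 # ~1,315,825
multi_tco = multi_train + multi_inf + 100        # +100 integration overhead

print(f"monolith: {mono_tco:,.0f}  multimodel: {multi_tco:,.0f}  "
      f"ratio: {mono_tco / multi_tco:.1f}x")
```

Changing any single assumption (average active specialists, the +100 integration overhead, the retraining cadence) makes the sensitivity of the ~4.4× result easy to explore.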

And conclusions:

So what did we do and say here?

We outlined a theoretical yet plausible system and compared two alternative ways to build it. The architectures compared are a single large model that handles everything (monolith) and a system built of components, i.e. small independent ML models that are integrated (multimodel architecture).

The multimodel architecture is ~4.4× cheaper over 5 years, dominated by inference cost savings.

Why modular wins decisively here

  1. Disjoint, high-signal domains
    No representational duplication penalty.
  2. Daily retraining
    Training efficiency compounds strongly over time.
  3. Sparse activation at inference
    Only the relevant subsystem runs per task.
  4. Embodied system
    Most tasks are local (navigate, lift, talk), not global reasoning.

This is almost the ideal use case for modular intelligence.

In this scenario, a monolith is paying a tax for generality it does not use most of the time.

Development cost of an AI system

Ainolabs talks about systems: entities that are operated for defined purposes and with expectations of generating value for their developers, owners, users and other stakeholders. Value depends on the business case. The cost can be estimated, and we’ll get to a formulaic expression of it below.

Systems are developed, launched, updated, supported and eventually replaced or discarded. They have a life span, and with a life span comes a lifetime cost. We try to estimate that at a somewhat abstract level of complexity, as discussed in Computer Science.

There are two ways to build an AI system. One is the current mainstream way: train everything into a single model, possibly create agent instances within it, package the model, and release and run it.

It is also possible to build a system of interconnected smaller models, minimodels. That is cost-effective when the system functionality splits into disjoint subdomains, and especially if the combined dataset would be dominated by noise – in signal-processing terms, the signal-to-noise ratio (SR) is very low in the aggregate dataset but high in the parts when compartmentalized.

Building a system with many models incurs general overhead. In the Ainolabs architecture that means the burden of the Content Classifier. This adds to the processing of each input and should be balanced by the reduced processing cost in the separate models (Learning Processes).

Scientish-like complexity estimates of development

By an AI system we mean a system that uses Machine Learning technology as an essential element. A system could be built, as mentioned above, as a monolith or as a collection of separate smaller models. The development cost of an ML (AI) system is mostly related to training, so the following estimates the order-of-magnitude cost of that.

We’ll use the following terms:

  • T = number of training examples
  • d = feature dimension per example
  • P = number of model parameters (often ≈ scale of the model)
  • E = training epochs / outer iterations (or rounds/trees, etc.)
  • k = clusters or neighbors (when relevant)
  • F = per-example forward+backward FLOPs (architecture-dependent; ≈ proportional
    to P for dense nets)

In machine learning, P represents the total number of trainable parameters — the variables
that the training process adjusts to minimize loss.

Taking noise into account

The number of parameters in the model, P, depends on the quality of the data. This has an impact on the model, the training and the semantic model capacity. Data with a lot of noise and no signal results in a very simple model, but one with no practical value – just consider the ultimate white noise, the ~3 K cosmic microwave background radiation. P is conceptually a measure of the information in a model.

If the training data as an aggregate is noisy – e.g. different sensory systems for different purposes look like noise when combined – splitting it might help.

Next we try to create a way to express and predict P in terms of dataset characteristics like size T, signal-to-noise ratio (SR), and desired accuracy A.

For that we need a working assumption:

For a model, the number of parameters P needed is roughly proportional to the amount of true signal in the data and inversely proportional to the noise, at a given desired accuracy.

We’ll skip a few steps and claim that the number of parameters is proportional (in “Oh”-notation) to the size of the corpus T, the signal-to-noise ratio SR, and the desired accuracy A. The exponents depend on the specific situation.

With that it is possible to express the order of magnitude of training a model.
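The expressions themselves are left open in the text. One hedged way to write them – a sketch, with a, b, c as situation-dependent positive exponents that are deliberately not fixed here – is:

```latex
P = O\!\left(T^{\,a}\,\mathrm{SR}^{\,b}\,A^{\,c}\right),
\qquad
C_{\text{train}} = O(T \cdot P) = O\!\left(T^{\,1+a}\,\mathrm{SR}^{\,b}\,A^{\,c}\right)
```

The second expression follows directly from the earlier assumption that training cost scales as T · P.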

Divide et impera, or split the system into parts

The full dataset T might be large and could consist of orthogonally different data (audio, video, and temperature, for example). In such a case it would make sense to split the task – into LP1, LP2 etc. in the architecture.

Next, assume the total dataset has size T and that it is divided into disjoint subsets T_1 … T_n.

Training costs for the comparison between a monolith and a multimodel system follow from the same scaling.
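A hedged sketch of the split and the two training costs, reusing the P(·) scaling above with the desired accuracy A held fixed, and writing C_CC for the Content Classifier overhead (that label is an assumption here):

```latex
T = \bigcup_{i=1}^{n} T_i, \qquad T_i \cap T_j = \emptyset \;\; (i \neq j)
\\[1ex]
C_{\text{mono}} = O\bigl(T \cdot P(T,\, \mathrm{SR}_{\text{agg}})\bigr),
\qquad
C_{\text{multi}} = O\!\Bigl(\sum_{i=1}^{n} T_i \cdot P(T_i,\, \mathrm{SR}_i)\Bigr) + C_{\text{CC}}
```

Splitting wins when each per-subset SR_i is high while the aggregate SR_agg is low, as in the warehouse example earlier.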

With these it is possible to compare the effectiveness of different or combined datasets for an operational system that utilizes Machine Learning technology – an AI system.

Elements of Trust


An essential part of the architecture for continuous machine learning is the set of trust relationships between components and participants of a system, and with possible external parties – contributing actors.

Content Credentials (https://contentcredentials.org/) attempts to address trust by placing a mark on content. This assumes that the recipient trusts the mark, and that the marks are not abused by malevolent actors. Both assumptions can fail.

A system that relies on any information or action by other parties needs to determine and maintain trust information about others. Trust is a multidimensional entity, including trust in someone’s veracity, competence, good will, transient factors, or other attributes. A human example is trust based on group membership.

The architecture contains an element “Trust level adjustment”. That refers to a simple process outlined in the illustration below:

This reflects the trust relationships between two agents (components), i and 0 (self). The relationship is asymmetrical on every attribute, the attributes may differ between agents, and the trust levels are a function of past experience.

With such a mechanism it is possible to build systems that gradually learn appropriate trust levels for their environment. The correct trust level is essential for continuous operation. It is argued that without that capability it is not possible to build an artificial intelligence system that can operate in an open environment – i.e. so-called AGI cannot work without proper trust-adjustment technology in place.
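The adjustment process can be sketched in code. The attribute names and the exponential-moving-average update rule are illustrative assumptions; the architecture only specifies that trust is asymmetric, per-attribute, and a function of past experience:

```python
# Hedged sketch of "Trust level adjustment". The attribute set and
# the update rule (exponential moving average) are assumptions.

ATTRIBUTES = ("veracity", "competence", "goodwill")

class TrustLedger:
    """Per-attribute trust held by agent 0 (self) toward other agents.
    Levels are asymmetric by construction: each agent keeps its own."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha       # weight of each new observation
        self.levels = {}         # (agent_id, attribute) -> level in [0, 1]

    def get(self, agent_id, attribute):
        # Unknown agents start at a neutral 0.5 on every attribute.
        return self.levels.get((agent_id, attribute), 0.5)

    def observe(self, agent_id, attribute, outcome):
        """outcome in [0, 1]: 1.0 = experience confirmed trust,
        0.0 = experience contradicted it."""
        new = (1 - self.alpha) * self.get(agent_id, attribute) \
              + self.alpha * outcome
        self.levels[(agent_id, attribute)] = new
        return new

ledger = TrustLedger()
ledger.observe("agent_i", "veracity", 1.0)   # confirming experience
ledger.observe("agent_i", "veracity", 0.0)   # contradicting experience
```

A real system would run such updates as learning cycles on the memory blocks rather than on an explicit table, but the shape of the mechanism is the same.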

Architecture for Continuous Machine Learning

The Ainolabs architecture for Continuous Machine Learning Systems (CMLS) is designed to support non-interrupted operation, modular construction, system and data integrity, scalability in system design and operation, and robust continuous operation.

These capabilities result in a machine learning system that can meet requirements on privacy, confidentiality and adapt to changing circumstances.

The blocks and the whole CMLS entity operate as a collection of reinforcement-learning engines. The individual inputs and outputs together with the memory blocks form a learning loop over time.

A CMLS receives inputs I0 … (to In, for discussion where an upper limit is needed). The CMLS responds with outputs O0 … On, again an endless sequence. Inputs and outputs match each other so that each Ii results in a corresponding output Oi. The system can produce an empty output.

The system uses the available memory blocks, Short Term Memory and Long Term Memory, to keep a log of inputs and outputs. For this discussion, the log is stored as memory imprints in the neural networks of the memory blocks. An implementation may also store a traditional transaction log; however, that is not relevant to the ML discussion.

The system uses each new Input Ik as a cumulative observation (feedback) on the earlier input and output sequences I0 … Ik−1 and O0 … Ok−1. This provides a CMLS with an increasing set of cumulative experience that can be used to reinforce earlier memories or to resolve ambiguities and possible contradictions.

Input and Output sequences from 0 to k are marked as I(k) and O(k) in the rest of the text. These denote the ordered sets containing items 0 to k.

A transaction is defined as an Input–Output pair, i.e. Transaction = { I, O }, and specifically Ti = { Ii, Oi }. The set of transactions 0 … i is marked T[i].
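The notation can be sketched in a few lines; the types are illustrative placeholders, not part of the architecture:

```python
# Minimal sketch of the notation above: a transaction Ti = { Ii, Oi },
# and T[k] as the ordered history of transactions 0..k.

from dataclasses import dataclass
from typing import Any, List, Optional

@dataclass
class Transaction:
    i: int                   # sequence index
    input: Any               # Ii
    output: Optional[Any]    # Oi; None models the empty output

def history(log: List[Transaction], k: int) -> List[Transaction]:
    """T[k]: the ordered set of transactions 0..k that later blocks
    (e.g. Contradiction Processing) consult."""
    return log[: k + 1]

log = [Transaction(0, "hello", "hi"), Transaction(1, "static", None)]
```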

CMLS Blocks explained

CMLS Blocks accept Inputs, process them using Output drafts and finally, after minimizing the overall Cost function, send out the Output.

Output can be empty if the Input is not understood or the system decides it is appropriate to say “no comment”.

Stimuli  / input

Input is received via some digital channel. Input can represent anything, including but not limited to electromagnetic radiation, sound in any medium, video (images, separate from light), text, illustrations, diagrams or other information or signal occurring naturally or generated by humans.

Input is associated with information about its source. Source can be unknown. See Source Classification.

Content Classification

Content Classification attempts to label the content into classes stored in Long Term Memory. This classification is context dependent and is affected by recent stimuli. The specific effect depends on the implementation and can e.g. increase or decrease reinforcement.

Content Classification can impact the memory imprint of Input both in Short Term Memory and Long Term Memory.

Source Classification

Source Classification attempts to label the Source of the Input. It uses information stored in Long Term Memory. Source Classification is context dependent. 

Source Classification can impact the memory imprint of Input both in Short Term Memory and Long Term Memory.

Short term memory

Short Term Memory keeps track of immediate input–output sequences and other transient pieces of information passing through the CMLS. It should be noted that, unlike a human’s, a digital CMLS Short Term Memory is limited only by available computing power and data storage.

Short Term Memory overflows strong memory imprints to Long Term Memory when it is either full or the Input is a wide match with the Short Term Memory’s Neural Network (a substantial part of the STM’s neurons fire).
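The overflow rule can be sketched as follows. The capacity, the “wide match” threshold, and the choice of moving the strongest imprints first are illustrative assumptions, not specified by the architecture:

```python
# Hedged sketch of the STM overflow rule described above.

STM_CAPACITY = 4          # assumed small for the example
WIDE_MATCH = 0.8          # fraction of STM neurons firing

class Memory:
    def __init__(self):
        self.stm = []     # (imprint, strength) pairs
        self.ltm = []     # permanent imprints

    def store(self, imprint, strength, match=0.0):
        """Record an imprint; overflow strong imprints to LTM when
        STM is full or the input is a wide match with STM."""
        self.stm.append((imprint, strength))
        if len(self.stm) > STM_CAPACITY or match >= WIDE_MATCH:
            self.stm.sort(key=lambda m: m[1])        # weakest first
            while len(self.stm) > STM_CAPACITY // 2:
                self.ltm.append(self.stm.pop())      # move strongest

mem = Memory()
for item, strength in zip("abcde", [1, 2, 3, 4, 5]):
    mem.store(item, strength)
# The three strongest imprints ("e", "d", "c") are now in LTM.
```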

Long term memory

Long Term Memory is the mass storage of CMLS. All permanent memory imprints are stored there.  The capacity of Long Term Memory is essential for CMLS – the more synapses, the more knowledge.

Long Term Memory is essential for storage (learning), retrieval of information, and also for graceful forgetting. As a CMLS is a digital device, there is no biological degradation. Graceful forgetting is intentional and can be based on time, source, or other factors. Notably, Long Term Memory stores timestamps with new information and can fade or completely remove memories that are valid only for a limited period of time.

Short Term Memory has a substantial role in time-related graceful forgetting. Fleeting information is kept only in Short Term Memory and will not leave an imprint on Long Term Memory at all. As an example, consider the weather or traffic conditions the day before yesterday. Those can be recalled for a few days, but not e.g. a month later, unless there was a specific and pressing reason to store the memories in Long Term Memory.
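Time-based graceful forgetting can be sketched with a decay curve; the half-life and the recall threshold below are illustrative assumptions:

```python
# Hedged sketch of time-based graceful forgetting: imprints carry
# timestamps and fade with age unless reinforced.

HALF_LIFE_DAYS = 30.0     # assumption: unreinforced memories halve monthly
RECALL_THRESHOLD = 0.1    # below this the imprint is treated as gone

def imprint_strength(initial, age_days, half_life=HALF_LIFE_DAYS):
    """Exponential decay of a memory imprint with age."""
    return initial * 0.5 ** (age_days / half_life)

def recallable(initial, age_days):
    return imprint_strength(initial, age_days) >= RECALL_THRESHOLD

# Yesterday's weather is still recallable; four months back it is not.
# recallable(1.0, 2)   -> True
# recallable(1.0, 120) -> False
```

Reinforcement would reset or boost `initial`, which is how a “specific and pressing reason” keeps a memory alive.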

Learning process for new inputs

The Learning process takes the Input and the associated Content and Source classifications and runs learning processes on Short Term Memory. See also the Reinforcement process.

Reinforcement process for familiar inputs

The Classified Input could be familiar from the past. Reinforcement checks that by retrieval from Long and Short Term Memory. In case the Input is familiar, the existing memory, including metadata about Source, Time and other possibly relevant factors, is reinforced together with the current Input via a learning cycle to either or both memory blocks.

Known entities

The Known Entities block relies on the memory blocks to check if the current Source is known from the past. Known Entities will run a learning cycle on Short Term Memory for new sources, and on both memory blocks for existing entities.

See also Trust level adjustment.

Trust level adjustment

Trust level adjustment compares the current Input with existing memories. If there are discrepancies between the current Source and other Entities, and/or in the content, the trust levels toward the current Source and/or familiar Entities are adjusted. Adjustment happens through a machine learning cycle on Short and Long Term Memory.

Trust adjustment is unique in impacting Long Term Memory directly in the case of changing the trust in a Known Entity.

Contradiction processing

Contradiction Processing is activated to check if the Input and/or the current Output draft contradicts Long or Short Term Memory. In layman’s terms, Contradiction Processing checks for disagreements, untruths, lies and differing belief structures for both content and source.

Contradiction Processing may impact Known Entities, Short Term Memory, Long Term Memory and the current Output draft. Contradiction Processing can trigger Trust Level Adjustment.

Contradiction Processing for Transaction Ti considers the full history T[i−1]. The NN is calibrated to minimize the possible difference between the new Input Ii and the tentative Output Oi given the known history T[i−1]. Specific implementations can weigh information according to its age differently, which results in different degrees of graceful forgetting and of learning new information.

Recipient Classification

The tentative Output may be fine-tuned according to the expected Recipient (the party providing the Input). This may not always be possible directly, but the system can make a difference between an anonymous party and Known Entities.

The Recipient is classified similarly to the Source, via Known Entities and Trust Level Adjustment. Notably, if memories in Long Term Memory are labelled confidential to certain Entities, any Output candidate to a Recipient in the same Trust circle would be treated differently than to Recipients not in the same trust circle.

Response / output

The CMLS is an Input–Output system with memory. Accordingly, each Input is matched with an Output. The Output can be empty if the CMLS determines it can’t provide a reasonable answer due to e.g. confidentiality or plain ignorance.

Output is generated gradually between the blocks. It is determined to be ready once the cost function is acceptably low, or after a timeout. The timeout value can depend on the nature of the Input, but ultimately there is an upper limit to the time a CMLS can take to respond to an Input.

Even if the Output is empty, the system may have generated memory imprints of the Input. Over time this is expected to result in learning and in the CMLS’s ability to eventually answer.

Conclusion

An architecture for building scalable machine learning systems from modules has been presented. This is similar to the invention of the subroutine: it is not necessary to build a whole system into one model. Instead, working systems can be composed of Domain Specific Models that are set to interact in an efficient and suitable combination.

Balls, cubes and pyramids

Or — how to create concepts?

Consider a robot hand exploring its environment by touch alone.

Add visual aspect, or a camera.

Then there is the verbal side, written and spoken: text seen and voices heard.

The hand is given a bunch of objects, the camera is viewing the action, and the verbal side – a bot, ChatGPT or whatnot – is chatting about it.

How do they agree on simple things such as “what is a ball”? For a human that is trivial, after the first 2-4 years of verbal and eye-hand coordination learning and development.

But how would anything similar be built for machines?

The robot parts would be facing something like the illustration below – generated by AI, obviously.

What is a concept?

For humans, a concept is an internal representation in the mind of a person. Concepts are formed automatically based on the lifetime experiences and observations of a self-motivated individual. Concepts are shared through communication, language, gestures, shared behavior and thus have commonalities.

Concept autodiscovery in the research literature

For robots, for AI, concept formation could be similar. Tenorio-González and Morales propose a method for “Automatic discovery of concepts and actions” (https://doi.org/10.1016/j.eswa.2017.09.023), where there is an “intrinsic motivation to discover new concepts, states and actions to learn behavior policies”.
In other words, a learning system should be programmed with goals and aspirations that drive the machine to discover its environment.

The actual representation of a concept is assumed to be a graph. For concepts to be compatible between different sensory systems, the concept graphs need to refer beyond the neural representation of sensory input; i.e. the set of visual cues of “roundness” needs to be connected, but it is not the only topic related to the concept “sphere”, or “ball”.

Aino Concept Repository

The Aino Repository system will provide a Concept Dictionary so that components can register and query the concepts understood by each part. The Concept Dictionary contains Knowledge Graphs with language elements, so that a motor unit that recognizes and can manipulate a spherical object would map the relevant sensory and motor operations to other representations of the concept “Ball”.

A Concept is a (collection of) Knowledge Graphs (KG) in the text below.

Operations:

Add an item to the Concept Dictionary
Find an item in the Concept Dictionary
Modify an item in the Concept Dictionary (new version or modified Concept – Ball, Football)

Find a KG in the KG Registry – keyword-based, for humans.
Find a KG in the KG Registry – content-based, for AI component / part discovery.
Add a KG–Component relationship into the KG Registry. The KG–Component relationship is m-n.
Remove a KG–Component relationship.
Modify a KG – new version.

Runtime operations:

Pass a KG from Component A to the Receiving Components (a group).
Receive a response from the Processing Component to the Requesting Component.
Search the Component Registry based on a set of KGs (the smallest set has one KG).
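The operations listed above can be sketched as a minimal repository API. The names, signatures, and the use of plain dicts to stand in for Knowledge Graphs are all illustrative assumptions:

```python
# Hedged sketch of the Concept Dictionary / KG Registry operations.

class ConceptRepository:
    def __init__(self):
        self.concepts = {}        # name -> list of KG versions
        self.kg_components = {}   # kg name -> set of component ids (m-n)

    def add_concept(self, name, kg):
        # Adding under an existing name creates a new version.
        self.concepts.setdefault(name, []).append(kg)

    def find_concept(self, name):
        versions = self.concepts.get(name)
        return versions[-1] if versions else None   # latest version

    def register_component(self, kg_name, component_id):
        self.kg_components.setdefault(kg_name, set()).add(component_id)

    def components_for(self, kg_names):
        """Search the registry for components that understand every
        KG in the given (non-empty) set."""
        sets = [self.kg_components.get(n, set()) for n in kg_names]
        return set.intersection(*sets) if sets else set()

repo = ConceptRepository()
repo.add_concept("Ball", {"shape": "sphere", "graspable": True})
repo.add_concept("Ball", {"shape": "sphere", "graspable": True,
                          "bounces": True})    # modified Concept, new version
repo.register_component("Ball", "gripper-arm")
repo.register_component("Ball", "vision-unit")
```

A production registry would of course persist the graphs and version metadata, but the operation set is the one listed above.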

Summary and next steps

There are ways for ML systems and robots to discover concepts on their own – concept autodiscovery. There are also ways to represent them, and Knowledge Graphs are a good approximation.

Knowledge Graphs – serialized as JSON or XML – can be used as database keys to store and retrieve information.

Ainolabs intends to build a repository system where digital assets (Knowledge), along with possible physical products, can be stored and accessed for purchase – i.e. a “Robot parts market”.


How to build a Barista Robot of parts?

Ainolabs’ search for the Holy Grail is to build a sufficiently advanced system that can replace an experienced professional. That Holy Grail is challenging for several reasons, but mostly because the way to do it would be to collect an immense amount of multi-channel data about a professional environment and then to process that into a single, all-encompassing model.

The difficulty stems not only from the volume of data, but also from its nature. A professional acts in a multi-sensory environment – the 5 human senses: touch, sight, hearing, smell and taste. In a working environment, sight and hearing are dominant. In addition, a professional working in an organization needs to be attuned to the social environment – who said what, and when. Combined, these explain why it might take an infant 20+ years to become proficient in e.g. Business Strategy Consulting.

Consider a simpler task: could a Barista Robot be built out of components?

A Barista Robot would need to interface with customers. A good over the counter customer service representative notices clients as they come in, keeps track of their order so that they’ll all be served in time, notices their demeanor and addresses each customer in an appropriately courteous and polite manner.

A Barista Robot also needs to brew coffee and possibly recommend suitable combinations.

There are great chatbot-style customer service machines. There are also robot arms that can brew coffee. The question is – how to combine the two so that the system Robot Barista would fluently serve incoming clients?

A modular architecture was proposed on Medium in 2023 – see https://medium.com/@amir.ghm/breaking-the-llms-16k-token-limit-introducing-the-modular-ai-systems-architecture-5a23b37139ac

That might do the trick if it were somehow possible to define the interfaces between parts, similar to how APIs are defined in Swagger files or how Docker components’ dependencies are listed in a manifest.

That, however, is really puzzling – in biological terms, equivalent to surgically attaching great barista hands to a well-spoken customer service representative. Besides being unethical, that wouldn’t work, since a barista’s hands need to be connected to a barista’s nervous system.

In digital terms – how would one connect a great customer service agent LLM with a great Coffee Making Barista model to a Barista robot arm?

I don’t know, I wish someone did.

Innovation is Change

The Economist put “innovation” in its proper place in a recent column:
“Innovation. Sustainability. Purpose. Yuck.” Indeed. Thanks to them for pointing that out. Accordingly, the title says “Innovation”, but the subject really is change. Innovation is change: change in making a new product, a new feature, a new market need – change in supply and demand.

From a change management perspective, a new product or a new company has the simplest imaginable setting, illustrated below.

The company has conducted rigorous and thorough market research and analysis, and has reached a view of the key competitive features of its product, represented by the orange-hued area on the right.

Customers – some of whom were consulted during the market research – had their own views, the jobs to be done. They see features and other products in the marketplace that help with their goals.

The company – in the single-product case, a startup – is making a bet that this initial product and its features are sufficient to gain market traction, or sales. That initial step is crucial, the subject of much literature, and outside the scope here, except from the change perspective.
The company believes something new is coming out, and that the change it makes is beneficial to its prospective and expected customers.
Customers see the same, or perhaps just another offering in the same or a similar category. For customers to use the new product, they’d need to change their perception and behavior – and there we are, with change again.

That inspiring column by The Economist? – see https://www.economist.com/business/2022/05/14/the-woolliest-words-in-business.

Who can you trust?


Setting computers and technical gadgets aside for a moment – who and what can you trust?

If you trust something to happen – sunrise tomorrow – you perhaps rely on substantial past experience. A philosopher might point out that past experience does not guarantee the future: it is remotely possible that the sun collapsed 8 minutes ago.

An astrophysicist would counter that based on what we know about stars, ours is still young and has a few billion years left, so no worries.

A meteorologist might ask if you mean that you’ll see the sunrise and would talk about clouds in the morning.

Still, the event might not be seen by you if you slept until later and only saw that the sun is up. It must have risen, since it is there.

Then what would it mean if you trust another person? That person will always be there for you? Tells you the truth as he/she knows it? Knows the subject, or is just helpfully speculating? How about keeping your secrets?

In brief, humans handle many shades and degrees of trust. Machines handle explicit trust, and poorly at that. For machines ever to come close to us in building relationships – and via them collective and individual wisdom – we have a ways to go in technical development.

In the meanwhile – hope you have people you can trust in ways that are important to you.

Procrastination

So many people have written about procrastination. So many people have delayed and postponed writing about procrastination, so why oh why would I bother. Isn’t it better to play one more round of Clash Royale, maybe clean the cupboards, fold linen or whatever.

Professionally, procrastination comes into play when you lose inspiration. An inspired mind and body will just keep on going, unless exhausted or otherwise stalled. A hobby, no matter how trivial or silly, is by definition inspirational, so how come work can become something else?

Occasionally it does, and when that happens you can try to grit your way through it, push harder, “get things done” – or you can ask: “If it seems I don’t want to do this, and I’d rather clean the toilet – is this job worth doing?”

I have no answers to that question, only the question. If you are stuck, can’t motivate yourself, and procrastinate over a thing – Was it worth doing in the first place?

Innovation and change

What is an innovation? And what is a change in business? Is it a change if prices go up and then down (a discount sale)? Under what circumstances would customers see price changes as something new, as an innovation? How small can a change be and still count as novel?

The answers depend on the specifics of the market situation and customer perception. On the other hand, if nothing changes, the market most likely will not observe novelty or innovation.

We’ll rely on a definition in the dissertation by Robert van der Have, “Seeking Speed: Managing the Search for Knowledge Innovate Faster”:

These [innovation] activities span from the creation and development
of new knowledge into an invention, and the process of its subsequent further
commercial development, including application toward specific objectives,
culminating in its practical utilization and economic/commercial exploitation.

Ideas are not innovations. Rather, “an innovation is a new product, service or process that is commercially exploited.” This leaves open the question of incremental innovation, i.e. new product variants or service improvements. For the sake of this discussion, innovation and change are considered the same. Thus, an innovation is a completely or partially new product or service that is commercially exploited. This definition leaves out failed innovations, i.e. ideas that may have been implemented inside a company but were either discarded before commercial launch or ended up as market failures. Those still need to be taken into account when discussing innovation.

Our interest is in concrete innovation practices: how to effectively develop and deploy new and improved products. At times a sequence of such changes is so profound that the product is actually a completely new innovation. Radically new changes are somewhat outside our scope, as they are closer to science and research and, as such, subject to different mechanisms.

Blue bubbles represent products and services as seen by the customers. Small blue dots represent relevant features of the products. Highlighting is used to illustrate how features and products are connected. Please note that there are mutual synergies between products. A naive illustration is a pizza with a fork and knife, where the fork and knife are separate products, both essential and more useful together than one at a time. The author realizes that utensils are often sold together as a unit and begs forgiveness for this silly illustration.

Orange bubbles, dots and areas represent companies’ view. Orange bubbles represent products, SKUs in concrete terms, and the small dots their components, features, inventions, manufacturing, delivery, service and other aspects that make the product complete. Competitive differentiation lies in these aspects and how they are packaged together as an offering.

Details matter, implementation counts, and only products that have been delivered matter. To get from ideas to products to revenue and profit, ideas need to be turned into innovations and delivered to customers. That is our interest, and we’ll return to it in the next part. As soon as we are done procrastinating.