Intensive training of combinatorial agents

The data domains available for training in practical applications are not always massive, and even when they are, reducing the data volume is tempting because less data means less processing time. Several approaches can reduce the amount of required data significantly and speed up training accordingly.

On the side of the combinatorial agent, the features that contribute most to this speed-up are generalization and meta-learning.

Generalization is the ability of the learning agent to produce useful output for input cases not present in the training data. Generalization speeds up learning because it reduces the amount of training data required.

Meta-learning is the ability of the learning agent to choose an appropriate way to generalize to new problems that have something in common with known problems. It takes into account various metadata, such as properties of the problems, performance measures of decision-making procedures, or patterns derived during training.

The meta-learning function of a combinatorial agent makes it possible to organize its training and use in a new way that further cuts the volume of data needed for training. The basic routines of the corresponding train-and-use method include training on examples, meta-supervision, and applied search.

Training on examples is a routine in which the learning agent is shown ready-made problem cases with their solutions, often marked up by a utility function. Training on examples is widely used in machine learning and generative AI.

Meta-supervision is a routine that controls learning by adding problems and examples to the training program that can affect the agent's metadata. It relies on diagnostic information that the agent transmits about its meta-learning state. To run meta-supervision, the agent must be trained not only to solve problems but also to tell how it finds the solutions. This metadata-related information can then be used to purposefully correct the agent's decision-making methods by changing the training program.
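The correction step of meta-supervision can be pictured with a minimal sketch. Everything in it is a hypothetical illustration rather than the actual implementation: the `Report` record, the `extend_program` function, the problem identifiers, and the simple rule of comparing the method the agent reports with the method the supervisor expects are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Diagnostic record sent by the agent (hypothetical layout)."""
    problem_id: str
    method: str  # how the agent says it found the solution


def extend_program(program, reports, expected, remedial):
    """Return a copy of the training program with remedial problems appended.

    expected: problem_id -> method the supervisor wants to see
    remedial: method -> extra problem ids that exercise that method
    The matching rule is an assumption made for illustration only.
    """
    extended = list(program)
    for report in reports:
        wanted = expected.get(report.problem_id)
        if wanted is not None and report.method != wanted:
            # The agent solved the problem in an unintended way:
            # add problems that demonstrate the intended method.
            for extra in remedial.get(wanted, []):
                if extra not in extended:
                    extended.append(extra)
    return extended


program = ["p1", "p2"]
reports = [Report("p1", "enumeration"), Report("p2", "pruning")]
expected = {"p1": "pruning", "p2": "pruning"}
remedial = {"pruning": ["p3", "p4"]}
extended = extend_program(program, reports, expected, remedial)
```

In this sketch only the report for `p1` deviates from the expected method, so the two remedial problems are appended once and the original program is left unchanged.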

Applied search is the target routine, in which training is combined with applying the agent to practical problems. No more training examples with ready solutions are provided to the agent. Instead, the agent gets access to data related to the problems and is expected to produce useful solutions. It works in dialog with a human supervisor, who can add or relax constraints on the problem, on the way it must be solved, and on the output representation of the solution. The role of examples shifts: a single case can demonstrate many terms and problems at once, whereas in training many cases are needed to demonstrate just one problem. This is training at the fastest speed, because the agent keeps acquiring metadata even though supervision is no longer focused on that. Only when a wrong understanding is diagnosed in the agent does the workflow return to the meta-supervision routine, bringing the focus back to training for a while.
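The movement between the three routines can be sketched as a small state machine. Only the return from applied search to meta-supervision on a diagnosed misunderstanding is stated in the text; the other transitions, the `Routine` enum, and the `next_routine` function with its two boolean inputs are assumptions introduced for illustration.

```python
from enum import Enum, auto


class Routine(Enum):
    TRAINING_ON_EXAMPLES = auto()
    META_SUPERVISION = auto()
    APPLIED_SEARCH = auto()


def next_routine(current, examples_left, misunderstanding_diagnosed):
    """Return the routine to run next (hypothetical transition rules)."""
    if current is Routine.TRAINING_ON_EXAMPLES:
        # Assumption: stay on examples while any remain, then supervise.
        return current if examples_left else Routine.META_SUPERVISION
    if current is Routine.META_SUPERVISION:
        # Assumption: keep correcting until the diagnostics are clean.
        return current if misunderstanding_diagnosed else Routine.APPLIED_SEARCH
    # Applied search: fall back only when a wrong understanding is diagnosed.
    return Routine.META_SUPERVISION if misunderstanding_diagnosed else current


# Each step is (examples_left, misunderstanding_diagnosed).
steps = [(True, False), (False, False), (False, True),
         (False, False), (False, False), (False, True)]
routine = Routine.TRAINING_ON_EXAMPLES
trace = []
for examples_left, diagnosed in steps:
    routine = next_routine(routine, examples_left, diagnosed)
    trace.append(routine.name)
```

With these inputs the agent stays in applied search as long as no misunderstanding is reported and drops back to meta-supervision on the last step.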

A training program in the train-and-use methodology expands the application domain by including more and more problems at different levels of abstraction in order to reach steady operation in the applied search routine. This aim places specific requirements on the AI learning algorithms and the training routines, and it calls for external training performance indicators. These indicators relate to the characteristics of the training problems that the agent can solve.
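One possible form of such an indicator, given purely as an assumption, is the share of training problems solved at each level of abstraction. The function name `solved_share_by_level`, the `(level, solved)` record layout, and the sample data below are hypothetical and are not measurements from a real agent.

```python
from collections import defaultdict


def solved_share_by_level(results):
    """Map each abstraction level to the fraction of its problems solved.

    `results` is an iterable of (level, solved) pairs; both the record
    layout and the indicator itself are illustrative assumptions.
    """
    totals = defaultdict(int)
    solved = defaultdict(int)
    for level, ok in results:
        totals[level] += 1
        solved[level] += int(ok)
    return {level: solved[level] / totals[level] for level in totals}


# Made-up sample records for illustration only.
results = [
    ("concrete", True),
    ("concrete", True),
    ("abstract", True),
    ("abstract", False),
]
indicators = solved_share_by_level(results)
```

An indicator of this kind is external in the sense that it is computed from the outcomes of the training problems, not from the internal state of the agent.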


Developers can manage combinatorial AI via training programs. Training can start from larger volumes of non-specific data, for example on reading and writing XML formats, solving optimization problems, and using external computation utilities, and then continue with application-specific data extracted from the developers' ready models and regular practices.

The main results of the train-and-use method are:

  • Solving applied subject-specific and abstract problems with intellectual routines made by combinatorial agents.
  • Sharing and replenishing the knowledge of the trained agent in a database format for further use and training.
  • Getting information about the way the problems get solved by the agent, which can be used for clarifying problem statements and supplementing the training program.

Solving each problem in applied search additionally trains the agent on the experience of both acceptable solutions and negative results. The train-and-use method is flexible: the longer the agent works, the less the order of the problems matters.

The train-and-use method changes the purpose and the way of solving problems with AI and communicating with AI. Its value lies not only in solving problems but also in a better understanding of the subject on both sides of the communication.


If our approach piques your interest, we invite you to a detailed discussion of the methods and of a pathway for integrating intensive AI into your technologies to meet the challenges you encounter.

© 2024 apparatus scientia d.o.o