Open access peer-reviewed chapter

Augmented Post Systems: Syntax, Semantics, and Applications

By Igor Sheremet

Reviewed: April 4th 2019Published: May 6th 2019

DOI: 10.5772/intechopen.86207

Downloaded: 290


Augmented Post systems (APS) are string-operating Prolog-like knowledge representation, affiliated with the “Set of Strings” Framework (SSF). APS descriptive and logical inference capabilities provide natural integration of Big Data with online analytic processing. This chapter is dedicated to strict formal definition of APS syntax, mathematical and operational semantics, and to its most valuable implementational issues, as well as to APS application to Big Data, Internet of Things, cyberphysical industry, and cybersecurity areas.


  • “Set of Strings” framework
  • augmented Post systems
  • string data bases
  • data and knowledge engineering
  • Big Data
  • Internet of Things
  • cybersecurity

1. Introduction

The background of “set of strings” Framework (SSF) [1, 2, 3, 4], developed for adequate modeling of Big Data, is the representation of database as an updated Set of Strings, whose structure is defined by the current metadatabase (MDB), being a set of context-free generation rules in the sense of N. Chomsky [5].

Augmented Post systems (APS), considered in this chapter, are result of deep integration in the one toolkit of two well-known formalisms—classical string-generating formal grammars and string-operating logical systems, proposed by E. Post [6].

By this, the main features of APS are:

  1. Deep connectivity with “Set of Strings” representation of databases.

  2. Deductive capabilities, similar to those, which are inherent to various deductive extensions of the relational model of data, as well as to Prolog and Datalog [7, 8, 9, 10, 11, 12].

The second feature provides selection of not only data, presenting in string databases (SDB) in the explicit form, but also data, which may be constructed (derived) from the explicit data by means of logical inference.

This chapter is dedicated to the formal definition and substantial description of APS and their applications. The terminology and abbreviations are the same, as have been used when describing SSF.

Section 2 is dedicated to the definition of syntax and mathematical semantics of APS, and Section 3 to their operational semantics. Procedural connection to APS is described in Section 4, and flow-processing APS (FAPS), based on the aforementioned connection, are described in Section 5. Implementation issues and applications of APS family are considered in Section 6. The conclusion contains proposals on further development of SSF.

2. Syntax and mathematical semantics of the augmented Post systems

The augmented Post system P=<S,D>is the composition of set S of the so-called augmented, or string (S-), productions, and set D of context-free generating rules, being metadatabase. Following the terminology of knowledge engineering, we shall call P also as APS-represented knowledge base (for short, APS KB).

S-productionσ=<q,d>is couple, the first component of which, named body, has the form of postproduction:


where m0and s0,s1,,smare terms.

Term s0is called header, while terms s1,,smare called conditions. The set of conditions is unordered, i.e., S-productions:




where i1im=j1jm=1mare identical. The aforementioned set of conditions may be empty in general case, i.e., m=0. The S-production with the empty conditions set looking like


is called S-axiom.

Component dof S-production σ=<q,d>is set of rulesγβand is called here variables declaration.

Rule γβdis called variable declaration ofγ, which defines Vγβset of possible values (domain) of this variable:


where Gis CF-grammar, corresponding to metadatabase D. It is essential that there is one and only one variable declaration γβfor every variable γ, having place in the body of S-production. S-production σ,which body s0s1,,smhas no any variables, i.e.,


for all i=0,1,,m, is called concrete S-production (for short CS-production). CS-production:


such that s0V+,d=is called CS-axiom.

S-production σ=<q,d>defines set of CS-productions σ¯in the following way:


where siδare strings in terminal alphabet V, which are created by the replacement of variables γ1,,γkby strings u1,,ukin the same alphabet, respectively; all occurrences of one and the same variable γiin all terms s0,s1,,smare replaced by one and the same string ui.



then S-production σSis correct to MDB D. This means that all terms of every CS-production from set σ¯are words of context-free language LG, and where G is context-free grammar with set of generating rules D.

APS KB P=<S,D>is correct, if all S-productions σSare correct to metadatabase D.

Notion of the APS KB extensional, i.e., set of facts, which may be derived from the knowledge base by means of logical inference, is defined as follows.

LetP=<S,D>be correct APS KB. Consider


i.e., S¯is a set of all CS-productions defined by all S-productions of this knowledge base in the sense (8)–(9).



and extensional ExPofAPSKBP=<S,D>, i.e., set of facts, defined by this knowledge base, is fixed point of the sequence W0,,Wi,Wi+1,, which is in general case infinite:


Evidently, due to P correctness,


and ExPis finite, if their exists isuch, that


If we consider the notion of extensional of APS KB from the linguistic point of view, then ExPis a language in alphabet V,which, according to (15), is sublanguage of LG. At the same time, from the point of view of knowledge engineering, ExPis the join of set of factswW0, which are known explicitly (they are called lower ground facts), and set of facts, which are derived from ground facts and/or another derived facts. As it is easy to see, set of ground facts W0is nothing else, than SDB, while S-productions with nonempty sets of conditions form APS KB intensional, providing the aforementioned inference.

Now we can define semantics of language of queries to APS KB.

Set-theoretical (S-) semantics of this language is similar to S-semantics of SDB query languages:


where W¯is APS KB extensional and Iis set of facts, which actuality check is the purpose of the query.

Here we shall use the simplest query language, being set of couples <s,d>, where sis term and dis its variable declaration.

Mathematical (M-) semantics of this language is based on (14) and the following evident equation:


Here, <<sd>,D>is correct APS KB, which one-element set of S-productions—selection criterion—contains S-axiom, defining set of facts, which may belong to ExP.

Example 1. Consider metadatabase from Example 2 of the chapter of this book, describing SSF. We shall add to it the following three rules:

<fact>SENSOR<i>LOCATEDATAREA<name of area>.

Facts like SENSORNNAT17.00NORMALcontain information about sensors, which gather and send to the fusion center the data about the state of the atmosphere in the surrounding area. Facts like SENSORNNLOCATEDATAREAYYcontain information about sensors’ location.

Consider APS knowledge base P=<S,D>, where MDB Dwas described higher, and Scontains following S-productions:

{a<name of area>,s<state>,e<i>,

As seen, σ2σ5are concrete S-productions, and set W0, corresponding to this APS KB, consists of ground facts, being headers of these CS-productions.

Query, whose purpose is to get information about smoked areas, may be as follows:

<AREAzIS SMOKEDATt,z<name of area>t<time>>,

while query, whose purpose is to get information about sensor FYlocation, may look like

v<name of area>>.

Evidently, this knowledge base extensional is



3. Operational semantics of APS

The background of operational (O-) semantics of the considered query language is the so-called S-unification, which is generalization of well-known unification, introduced by J. Robinson in [7] and used in the resolution procedures, developed for the first-order predicate logic. (In fact, S-unification is a strict generalized definition of heuristically described unification, which, as it was shown in [1], is incorrect in the case of multiple occurrences of at least one variable to one of the unified terms.)

We shall consider basic concept of S-unification and then describe operational semantics of used query language, corresponding to (11–18) and defining main features of the controlled logical inference of answers to queries to APS KB.

Let <s,d>be a query and <s0i,di>is couple, formed from S-production:


As may be seen easily, the answer to <s,d>may contain headers of the CS-productions, being concretizations of σi, i.e., belonging to set σ¯i, if


This condition is equivalent to the existence of solution of word equation on context-free language (WECFL):


(for correctness, sets of variables of terms sand soiwould not intersect, that is, simply implemented by variables renaming, if this condition is not satisfied).

Result of WECFL (21) solution is the new declaration of variables of S-production σi, “concreted” in accordance with query <s,d>. This concretization will be denoted as disois,d;it is obtained by replacing variable declarations βjiof terms0iby new ones β¯ji, which are more informative (“concrete,” “precise”), regarding initial declarations in di.This concretization is adequate to query <s,d>in such a way, that CS-productions, whose headers may enter the answer to this query, may be created at the following steps of inference.

Logics of S-unification are illustrated in Figure 1, followed by Example 2.

Figure 1.

Graphical illustration of S-unification.

Example 2. Consider query <s,d>, where

d=e<name of area>t<time>,

and S-production σ1from Example 10, where

d1=a<name of area>s<state>e<i>t<time>.

According to (21),

s01d1=AREA<name of area>IS<state>AT<time>,
sd=AREA<name of area>ISNORMALAT<time>,
={a<name of area>,sNORMAL,e<i>,


S-unification provides opportunity of answer derivation by controlled logical inference, which is implemented by the axiomatic system, which contains only three inference rules—top-down successful S-resolution, top-down unsuccessful S-resolution, and bottom-up S-resolution [1, 2]. Here we shall describe more natural and understandable procedural representation of the considered O-semantics, which fully corresponds to the aforementioned axiomatics.

This representation contains two recursively interconnected algorithmically defined functions.

The first of them Qhas two input variables sand d, which values represent query <s,d>, and one output variable A, which value, returned after Qtermination, is the answer to this query, as defined by (17).

The second function RBhas also two input variables, φand d, which values represent, respectively, the so-called residual body RBof S-production (subset of body of S-production, including terms, which until current call were not used for the derivation of new queries) and variable declaration, obtained after previous steps of logical inference. Function RBreturns set Dof variable declarations, each corresponding to one element of answer to the query <s,d>, where sφis the next term, extracted from RB. By this, input variables declaration dis made “more precise.”

Text of function Qis as follows:

1Q:functionsdreturnsA;2variables(ss0term,d,dδδdeclaration,φsetof terms,3Asetof words  initial)local;4variablesSsetof productionsDsetof contextfree rulesglobal;5do<s0,φ,d>S;6δds0s,d;7ifδ8thendoδRBφδ;9A:s0δ;10endδ;11endS;12endQ

As seen, lines 2–4 contain declarations of variables, used lower in the function body. It is essential that there are two variables declared as global, namely, S (set of S-productions) and D (metadatabase). Nevertheless the last one is not used in the body, it is presumed that MDB is used inside S-unification (line 6). The initial value of local variable Ais an empty set.

The main part of Q is loop (lines 5–11) on S-productions, entering set S and represented in the form of triples <s0,φ,d>, where s0is header; φis body of S-production, being set of terms; and dis its variable declaration. The first operator inside loop (line 6) implements S-unification of query <s,d>and couple <s0,d>. If this S-unification is successful (i.e., its result is not a special value ), then loop on elements of set of variable declarations, obtained by function RB, is executed (lines 8–10). The values of input variables of function RB are φ(set of terms, entering body of the current S-production) and δ(result of the aforementioned S-unification). Each δ, being set of variable declarations of the form γu, where uV, is used for making facts s0δby substitution right parts of those declarations to term s0, and all such facts are joined to the output set A(line 9).

Text of function RB, in turn, is as follows:

1RB:functionφdreturnsD;2variables(φsetof terms,ddeclaration,sterm,wfact,3Dsetof declarations  initiallocal;4variablesDsetof contextfreerulesglobal;5ifφ=6thenDd;7elsedosφ;8dowQsd;9D:RBφsdsw,;10endw;11ends;12endRB

Here lines 2–4, as in the text of Q, are declarations of variables, used in the RB body. RB execution begins by check, whether set φis empty (line 5): if so, search of terms of S-production body is already finished (or, as partial case, it is empty because S-production is axiom), and the returned value is one-element set d, where dis the same, as input value. Otherwise loop on terms, entering residual body, is executed (lines 7–11). Every selected term sis used in query <s,d>to KB <S,D>, which is processed by recursive call Q (s,d), and loop on all facts, entering answer to this query, is executed (lines 8–10). The body of this loop contains single operator (line 9), providing accumulation of set Das a result of recursive call of the same function RB with the first input variable being φs(i.e., set of terms of the body of the processed S-production, reduced by extraction of the processed term s) and the second input variable value being dsw,(i.e., S-production variables declaration, made “more precise” according to result of S-unification of query <s,d>and fact w).

As it is easy to see, execution of loop sφprovides generation of n!sequences of terms of body of S-production, i.e., all possible permutations of these terms.

Let Qsdbe finite result of function Q application to the query <s,d>and APS KB <S,D>. There is theorem on the correctness of the described operational semantics of the augmented Post systems [2].

Theorem 1. Qsd=ExSDEx<sd>.

In fact, Q and RB collaborative operation provides top-down derivation of the set of CS-productions, involved in creation of non-ground facts, which enter the answer to the query <s,d>, in full accordance with bottom-up definition of M-semantics of APS. (Ground facts are simply selected by one-step application of Q and RB.)

As seen from this description, logical inference is implemented in a wavelike manner. “Direct waves,” including generated queries, which contain “more and more concrete” variables declarations, are “spreading” until S-axioms are reached. This leads to the creation of CS-productions, and “back waves” start until a new set of facts, being answer to some intermediate (or, finally, initial) query, is assembled. Both direct and back waves meet one another and “interfere,” producing new queries. As may be seen, APS operational semantics corresponds to the universal mode of inference (all facts, being goal of the query, are derived), unlike existential mode (anyone fact from the possibly multifact set), which is inherent to known resolution procedures. By this, full accordance of APS O-semantics to its S- and M-semantics is achieved.

4. Procedural connection to APS KB

One of the most important features of any knowledge representation model are embedded tools of the procedural connection, providing interaction of “rigid” (non-updated by knowledge engineers) software modules with inference process and data exchange between inference engine and the aforementioned modules.

Background of the procedural connection to APS knowledge bases are program (P-) productions.

P-production is couple <s0p,d>,where term s0is header and set dvariable declaration; s0anddare declarative component of the procedural connection. Symbol ""is the divider, distinct from "", and pis the name of the connected program (software module), which has its own extensional WpEx<<s0d>D>LG.

Couple <s0,d>is called cover of the connected program p. In general case one and the same program may be connected to APS KB by several P-productions with different covers.

Example 3. Program named MULT for multiplication of integer numbers may be connected to knowledge base by P-production:


MULT extensional is an infinite set containing strings like 11=1,05=0,3111=311, etc.

Let Πbe the set of names of programs, connected to APS KB. The extensional of this KB will differ from (12) to (14) by the only feature; instead of (12) the following is used:


i.e., W0is a join of extensionals of all connected programs and set of ground facts, being heads of CS-axioms. Until general case will be discussed, we shall consider programs with the so-called static extentionals, whose basic feature is Wp=constfor all period of query processing.

Concerning operational semantics, it is sufficient to extend function Q by operators, providing call of the “perspective” connected programs via unified application programs interface and joining resulting sets to the accumulated answers. The extension is as follows:


As seen, after search on S-production set (lines 5–11), similar search on P-production set would be executed (lines 12–16). If S-unification of query <s,d>and program pcover <s0,d>is successful, program call is executed, and its result—set of words, denoted ps0δ—is joined to the accumulated set A. This fully corresponds to (22).

As seen, search on the set Sis nondeterministic (there is no any predefined order on this set, except S-productions processing before P-productions; but in general case, this ordering, of course, is not mandatory). This obstacle creates some difficulties in implementation of calls of the connected programs, when input data are less informative, than it is necessary for computation (e.g., MULTab=c{a<integer>b4c<integer>}. By this, result of such call may be infinite set, even if there would be an opportunity to implement this operation using tools for N-facts processing, developed in [1, 2, 3, 4].

To avoid such difficulty, it would be reasonable to implant to S- and P-productions some information, which may cut off “dangerous” branches.

There is a simple tool for logical inference control within APS knowledge representation. Namely, variables, which the declarations after S-unification would have right parts, consisting only of terminal symbols (i.e., there would be no incomplete information, associated with non-terminals, inside these declarations), are marked by point over arrow in the initial descriptions, having place in P- and S-productions. Such variables are called complete. In example 3 declarations of variables aand bwould be ȧ<integer>,ḃ<integer>, so query


while inference will result in program MULT call, while query

<xa=s,x<integer>a135s<integer>>will not, because of information incompleteness of the declaration of x, which is a complete variable. Negative result of S-unification in the case of such incompleteness is, as higher, denoted by symbol ∇.

Let us consider now the general case, where extensionals of programs, connected to APS KB, are not static, i.e., may vary while answer derivation. The simplest examples of such kind of programs are database management systems.

To connect DBMS, operating key-addressed databases, it is sufficient to join to APS KB P-production:


for queries (variable k, denoting key, is complete), as well as


for update operations (both variables kand d, corresponding to key and data, are complete).

DBMS, operating databases with symmetric access, may be connected in a following way. In order to minimize redundant search while queries processing, some substrings of facts may be declared indexed (used for creation and maintenance of dynamic search trees, as it is described in [2, 3]). So some covers of the relational DBMS, corresponding to some subsets of DB, may contain complete variables, with declarations like γ̇u, where → u defines domain of the indexed attribute of the relation. Covers, providing inclusion to SDB, contain only complete variables. (If DBMS operate SDB with incomplete information, any variable may be incomplete.)

5. Flow-processing APS

Described approach to the procedural connection makes easy integration of the distributed hardware components, linked by the computer network, into a single system with the single operation process. This may be done by procedural connection of the corresponding hardware drivers and implementation of the so-called flow-processing augmented Post systems.

FAPS KB includes along with S- and P-productions also the so-called flow (F-) productions having the form of


where terms s0, called activator, and sm+1, called actor, contain only complete variables and symbol “” (inverse to “”, used in S-production) divides activator and body, including, along with actor, conditions s1,,sm. This structure is rather usual as well as operational semantics of F-productions, which is defined by function F, similar to Q and RB, and interconnected with them into integrated algorithmics.

We assume that F-productions are represented in the knowledge base as couples <s0,φ,s,d>, where s0is activator, φis set of terms-conditions, and sis actor, while dis variable declaration. By this, APS KB contains three types of objects, corresponding to three types of productions:

  1. <s0,φ,d>, whereφis set of terms (S-productions);

  2. <s0,p,d>, where pis string, being name of the connected program (P-productions);

  3. <s0,φ,s,d>(F-productions).

All these objects are accumulated to set S, which, along with metadatabase D, present in the bodies of functions Q, RB, and F as global variable.

Function F is as follows:

1F:functionw;2variabless0stermdδδdeclarationwwordφsetof termslocal;3variablesSsetof productionsDsetof contextfree rulesglobal;4do<s0,φ,s,d>S;5δds0w,;6ifδ7thendoδRBφδ;8Fsδ;9endδ;10endS;11terminate;12endF

As seen, the search on F-productions set (lines 4–10) provides selection of such F-productions, which are activated by input message w. The set of conditions of every activated F-production is interpreted as residual body of S-production (lines 7–9). Every variable declaration δ, obtained as a result of this interpretation, is used for the creation of new message by substitution of δto actor s. This message is used as input value for function F recursive application. So the wave of messages, triggered by initial message, is generated, modeling well-known blackboard architecture. This wave propagation of any programs (including DBMS, software/hardware drivers, providing activation and operation of various devices, as well as networking middleware) may be applied. Operator terminate (line 11) stops F execution, so no return to the parent call is performed.

Various practice-oriented dialects of APS, providing various effective technologies of non-procedural programming (multiactivating, triggers, terms with negators, decision tables, etc.), were developed; also described operational semantics of APS and its extensions were redefined for highly parallel hardware environment [1, 2]. Taking into account nondeterministic nature of operational semantics of APS family, some additional tools for logical inference control were introduced in [1, 2].

Let us consider now the main features of APS application to the most advanced areas—Big Data, Internet of Things, cyberphysical industry, and cybersecurity—forming the future digital economy information infrastructure (DEII) [13, 14]. In all aforementioned applications, APS knowledge representation plays interconnecting and flexifying role, providing fast integration of various heterogeneous systems and fast adaptation of their operation logic to the highly volatile environment. In fact, APS simplify to the maximally possible level the most complicated problem of interoperability of any a priori developed systems; APS KB is nothing but “glue,” which in the simplest regular way integrates them together.

6. Implementational issues and applications of APS

DEII is the result of integration of three basic paradigms: network-centricity, Big Data, and Internet of Things (Figure 2).

Figure 2.

Basic paradigms of digital economy information infrastructure.

Network-centricity provides interconnection of any subjects of digital economy (DE), integrating human society and technosphere into global technosocium, operating as a global megasystem for the mankind wealth and prosperity. Big Data provides storage of great amounts of data from multiple heterogeneous sources and use of these data for rational everyday operation of the aforementioned megasystem and its permanent improvement. The Internet of Things, whose more precise and correct name would be Internet of Devices, provides creation, development, and operation of the aforementioned future global technosphere, including cyberphysical industry, based on additive manufacturing technologies, deeply robotized smart logistics, smart energy generation and delivery systems, and life resources infrastructure (food, water, living houses, etc.), as well as digital banking and finance infrastructure. All these segments, containing many billions of interconnected devices, provide implementation of such ambitious initiatives as Smart City and Smart Nation, inevitably leading to the Smart World as a well-understood and achievable goal of mankind evolution.

The most complicated problem to be solved for DEII creation and development is providing its flexibility, i.e., rapid correction of operation logic of various DEII elements and sets of elements to the constantly occurring changes of the environment, as well as to changes in our knowledge about nature, human society, and technosphere. It is evident that sufficient flexibility of DEII may be achieved only on the background of knowledge engineering, leading to the knowledge-based digital economy.

As shown in Figure 3, every subject of the DEII (no matter, human, or device), being associated with unique address of the address space of global information infrastructure, is supported by local knowledge base (LKB), applied by knowledge interpreter-corrector (KIC) for processing of input information flow, as well as local database, used and updated while aforementioned processing. Here LKB may contain not only “soft” component, i.e., rules, defining logic of input messages processing, but also “firm” component, i.e., “rigid” (nonmodified) software modules and systems (up to DBMS clients and servers), connected to LKB by the common interface, and called while rules interpretation (their sets are marked CP, e.g. Connected Programs, with lower indices).

Figure 3.

Linearized representation of the knowledge-based digital economy information infrastructure.

The described approach may be effectively implemented, if there would be some unified data item, providing unified representation of operation logic of any system, joining any set of DE subjects.

The possible practical outcome of the SSF is the creation of toolkit, providing the described higher approach to DEII implementation upon string as the aforementioned unified data item. This outcome in the integrated form is placed in Figure 4, where APS family of string-operation knowledge representation models is explicated.

Figure 4.

Integrated representation of APS family of knowledge representation models.

The operational semantics of the simplest practically used model from this family, whose background is FAPS, is shown in Figure 5.

Figure 5.

Basic operation semantics of multiactivated flow APS.

As seen, input messages, which are transported to the local subject of DEII by means of network infrastructure, are recorded on the external blackboard (in fact, there are two blackboards used: external, for messages from the external sources, and internal, for messages generated while current external message processing). Every next message (string) wis navigated by special associative index (binary/ternary tree) to the activators soi. Selected activators are used for solving the corresponding WECFL (to simplify variable declarations and accelerate messages parsing, the so-called ultragrammars were introduced in [1, 2]), and if solutions do exist, they are transferred to the rest part of the F-production, in which the terms may provide access to databases with symmetric access (DBSA) or key-addressed databases (KADB) SQL-like or NoSQL queries and updates; check database integrity before updates (by applying constraints defined by the corresponding sets of productions); activate devices by their drivers, connected to LKB by proper P-productions; perform some nontrivial processing by connected “firm” software products; and also send messages, which are created strings to the internal blackboard or, by lower-level communication software, to another local subject (i.e., their external blackboards).

As seen, described dialect of APS family provides flexible implementation of operations, necessary for Big Data and Internet of Things, as well as its cyberphysical manufacturing segment, i.e., industrial Internet of Things. In the simplest case, strings, being transferred to 3D printers by their driver’s calls, may be STL files containing layer descriptions of the printed material objects [15].

The main feature, which is, in fact, core for all APS family, is “additivity” of the local knowledge bases. Namely, the occurrence of any new device or human, generating earlier unknown and unprocessed messages, is supported by addition to LKB of subjects (again devices or humans), communicating with source of such messages, new elements with activators, providing initiation of their processing. Such occurrence results, finally, in creation and sending new messages to another subjects. Such “additivity” may be called “vertical” (new types of messages are supported by the addition of new F-productions with new activators to LKB). However, horizontal “additivity” is also supported, when new F-productions with proper activators are added for internal messages, which are sent by some modules of LKB while processing external messages. By this, internal messages processing becomes “deeper.” As seen, FAPS provide sufficiently regular and refined technology of the distributed system software development and debugging.

Let us underline that unlike well-known tools from the object programming area, which also support interconnection between modules by messages and use blackboard for such processing, in the case of flow APS, new message receivers are not known, and thus, there is no opportunity for its determination in the text of the program module. By this, APS-based knowledge engineering approach to software making principally differs from object programming as well as any other approaches from the more or less procedural programming.

Let us pay some attention to such important area of flow APS application as cybersecurity, which is critical for DEII normal operation.

Dynamically extended spectrum and complexity of cyberattacks, implementing today advanced persistent threats (APT), have led to the necessity of development of more and more sophisticated tools for early recognition and prevention of APT. The most efficient of such tools are based on the security information and event management (SIEM) paradigm [16, 17, 18, 19], whose background technologies are deep packets inspection/deep packets processing (DPI/DPP) [20, 21] and data leakage prevention (DLP) [22, 23]. The first operates flows of bit packets, providing their fusion in order to recognize elaborately covered signatures of known cyberattacks, while the second operates usually traffic from the application level of the OSI model. However, both operate strings (no matter, bit, symbol, or combined), and by this reason the described higher FAPS are the perfect tool for the SIEM implementation, especially at security operation centers (SOC), providing fusion of the primary data, passing to SOC from large amount of software and hardware traffic sensors from the security perimeter of the protected system. (Such sensors may be also effectively implemented on the FAPS background.) It is very important that SIEM is solidly based on three pillars: Big Data, which are stored fragments of the network traffic, caught at many network links in a 24 × 7 regime; data mining; and knowledge engineering. This area is the hardest known case of data mining and knowledge-based technologies application: cybersecurity knowledge engineers, by analyzing accumulated enormous volumes of primary data items, must quickly understand signatures of the prepared or already performed cyberattacks, developed by the smartest people from the global cybercrime and states special services teams; and rapidly correct LKBs or extend them by new sets of productions, providing early recognition and neutralizing such attacks. This combination of very large volumes of data and knowledge with hard real-time regime of SOC operation (e.g., SOC of the Russia largest financial group Sberbank neutralizes daily over 14,000 cyberattacks from all over the world [24]) makes cybersecurity application of data and knowledge engineering the most important from both practical and theoretical points of view.

7. Conclusion

The described “Set of Strings” Framework is based on integration of three research areas—knowledge/data engineering, theory of grammars, and information theory—which until now are very close but rather isolated from one another. Synergetic effect, which may be the main result of such integration, would be useful for the development of the unified theory, necessary for convergence of Big Data, data mining, artificial intelligence, and Internet of Things. Such convergence is extremely needed for solution of all spectrum of problems of digital economy.

The most valuable directions of the SSF future development may be the following:

  1. Metainference and machine learning in APS KB.

  2. N-facts concretization in SDB with incomplete information (SDBI) and APS KB, using functional dependencies, associated usually with the relational data model.

  3. Contradictory SDBI and APS KB management.

  4. SDBI and APS KB with metadatabases, describing two- and three-dimensional objects.

  5. SSF ideology and techniques transfer to the numeric data with the help of the SSF-associated theory of recursive multisets and multiset grammars/metagrammars, developed for the solution of problems of planning and scheduling in digital economy [25, 26, 27, 28].

  6. Hardware implementation of key elements of SSF in the data flow and other high-parallel computer environments.

The author is highly interested in the cooperation with computer scientists and engineers, who may spend some research effort participating in the development of the listed directions and the SSF at all.


The author expresses gratitude to the editor for useful advices. The author is grateful to Prof. Noam Chomsky for words of support to the development of the “Set of Strings” Framework. The author is also thankful to Prof. Jeffrey Ullman for useful remarks on the current state of the theory of grammars.

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Igor Sheremet (May 6th 2019). Augmented Post Systems: Syntax, Semantics, and Applications, Introduction to Data Science and Machine Learning, Keshav Sud, Pakize Erdogmus and Seifedine Kadry, IntechOpen, DOI: 10.5772/intechopen.86207. Available from:

chapter statistics

290total chapter downloads

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Next chapter

Serialization in Object-Oriented Programming Languages

By Konrad Grochowski, Michał Breiter and Robert Nowak

Related Book

First chapter

Concepts, Historical Milestones and the Central Place of Bioinformatics in Modern Biology: A European Perspective

By T.K. Attwood, A. Gisel, N-E. Eriksson and E. Bongcam-Rudloff

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us