@q file: intro.w@> @q% Copyright Dave Bone 1998 - 2015@> @q% /*@> @q% This Source Code Form is subject to the terms of the Mozilla Public@> @q% License, v. 2.0. If a copy of the MPL was not distributed with this@> @q% file, You can obtain one at http://mozilla.org/MPL/2.0/.@> @q% */@> @** Summary of Yacco2's user library.\fbreak These are the building blocks of various definitions for all derived code emitted from Yacco2 with their runtime objects. All code blocks are generated by |cweb|'s ctangle program drawn from their source file names having an extension of ``.w''. Points 8 and 9 are created from the |thread.w| source. The following are the outputted files:\fbreak \ptindent{1) |yacco2.h| --- common definitions for all implementations and use} \ptindent{2) |yacco2.cpp| --- common parts of yacco2's library created from this document} \ptindent{3) |wthread.cpp| --- thread components} \ptindent{4) |wrc.cpp| --- raw characters mapping into terminals} \ptindent{5) |wset.cpp| --- set routines for the finite automaton tables} \ptindent{6) |wpp_core.cpp| |wproc_pp_core.cpp| --- include code for generated pp threads} \ptindent{6.5) |wpp_core.cpp| thread, while |wproc_pp_core.cpp| procedure call version} \ptindent{7) |wtok_can.cpp| --- specialized token containers: reads chr from file and string} \ptindent{8) |war_begin_code.h| --- arbitrator's start code} \ptindent{9) |war_end_code.h| --- arbitrator's end code} \ptindent{10) |wtree.cpp| --- tree container, walkers, and functors} The 3 files generated outside this environment and referenced within Yacco2's library:\fbreak \ptindent{1) |yacco2_k_symbols.h| --- lr k terminal definitions} \ptindent{2) |yacco2_characters.h| --- raw character terminal definitions} \ptindent{3) |yacco2_T_enumeration.h| --- enumeration of symbols} Some Yacco2 memorabilia:\fbreak \ptindent{1) yacco2 --- library namespace} \ptindent{2) directory --- ``\Whereyacco2/library''} \ptindent{3) |wlibrary.w| --- yacco2's |cweb| document} \ptindent{4) 
Look at the |Global macro definitions| and |Typedef| for limitations} At the end of this document is a |Notes to myself| section that you should read. These are a quasi set of ramblings on old / new reasons for changes, whys of the current implementation, and items for future redress. Please browse them while reading this document. The notes follow the order of my programming thought zones as the library was developed. @*2 Introduction to Yacco2's parse library.\fbreak Welcome to |Yacco2|'s library. This is the |oracle| of |typedefs|, |macros| of assorted functionality, and |constant| definitions. By having a common source code generator of definitions for the library, it should make this project easier to maintain and evolve. Instead of using the basic type definitions of the \CPLUSPLUS/ language, I felt the |typedef| facility would make it easier to port the project onto another platform of n-bit evolution --- ahh the crazy world of bit envy 16...32...64 etc. Any inconsistency within the C language like |char| and its smorgasbord of flavors should be minimized by this approach to handling the pain-no-gain syndrome of supported systems. Now I'm a fan of macros as they give a nice way to dynamically generate source code patterns. Unfortunately, the C language preprocessor was a hack that people are still living with while the PDP11 macro assembler facility of bygone years from the now defunct Digital Equipment Corporation had class. All this to say, I am still using macros but trying to restrict their use. Within this project, macros provide the tracing facilities for the emitted grammar code, and the library's debug version. From experience with this library first written in \CPLUSPLUS/, various refinements to the tracing output were needed. When one parses a large file having possibly hundreds of threaded grammars running dynamically within a session, if all tracing classifications are turned on, the traced session output can get rather large. 
Message tracing alone is very verbose but at least you have options to track down problems. This was very helpful when I relied on Microsoft's take on messaging. When threads became latent due to dropped messages (unexpressed limitation of the number of messages allowed in their Window queue), at least I could re-evaluate how I would roll my own. Well, you'll see later how I re-implemented message queues with mutexes. Now |cweb| provides various flavors of macros. Some macros use parametric substitution per C source line. A great feature of |cweb| is its code snippet insertion facility. The description of the code section provides a better reading of the code. One is not caught up with the details but the intent. I consider it a version of pseudo-programming in the real or is it a real coding in the pseudo? So the following is a re-engineering of Yacco2's library from \CPLUSPLUS/ code to |cweb|. @*2 Using Yacco2.\fbreak Where are those damn objects? Make sure your \CPLUSPLUS/ compiler and linker are given the directions to where the |#include "yacco2.h"| file resides and Yacco2's appropriate object library. For example, the Yacco2 environment to use and link to are as follows:\fbreak \ptindent{\Whereyacco2/library - where the include file resides} \ptindent{\Whereyacco2/library/xxxx - where xxxx is debug or release for the object library} Within the ``Visual Studio \CPLUSPLUS/'' product, one can provide the appropriate directions within the project properties and preprocessor symbol definitions used to control code inclusion. One can also create an Environment variable in NT by going to the `System panel', choosing `System properties' followed by `Advanced properties'. You can possibly use `Yacco2' and `Yacco2lib' as the variable names: it's to your taste. The HP \CPLUSPLUS/ product and linker can be expressed by command line parameters. @*2 Overview of Yacco2's components.\fbreak Still under thought construction --- procrastinating am I...? 
@*2 Rules of the name.\fbreak There are not too many dictates. I try to give meaningful names to the components, be they methods, variables, or symbols. I lean a little too far in verbosity as in the Germanic description given to a symbol's name. Use of |cweb| will lower this trait. Cryptic names don't have a long life in their intent: future readings of the code usually require a rebuilding of code comprehension. Typical coding comments are not enough. There are usually unspoken premises that trip up the programmer. This is why, for me, `Literate programming' is the only way to go with its adjunct |mpost| diagrams (MetaPost). I say this in an asymptotic way as perfection is the carrot before the coder striving for a moment's perfection that is just a drop in the programming space. Too many programmers are stuck in the one dimension of code: `just get it done' that becomes a debugging issue of learning that does not get reframed into documentation. Judge accordingly my attempt at how the how, why, when, where, what, and whom are expressed. This is a quasi diary of my internal debates, mistakes, and evolutionary corrections in comprehension to programming Yacco2. Rule number one: Use the imperative verb form to express a method name. For example, to read or set a variable named xxx, the imperative actions can be |read_xxx| having no parameter, and |set_xxx| with its appropriate parameter. From experience, overloading the method name by presence or absence of a parameter tempts error. I am more disciplined on the setting of variables due to past traps. As regards reading a variable's value, I'm more relaxed as you will see some variations. You'll find that, for efficiency reasons, I access the variables directly instead of through the wrapper function: yes I know the arguments of ``OO'' but inlining in my opinion got fumbled. 
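To make rule number one concrete, here is a minimal sketch. The class |Token_position| and its variable |line_no_| are made up for illustration and are not part of Yacco2's library; only the |read_xxx| / |set_xxx| naming pattern comes from the text above.

```cpp
#include <cassert>

// Hypothetical class, not taken from Yacco2: it only illustrates the
// naming rule with a made-up variable called line_no_.
struct Token_position {
  Token_position() : line_no_(0) {}

  // Imperative verb, no parameter: read the value.
  int read_line_no() const { return line_no_; }

  // Imperative verb with its parameter: set the value.
  // The setter is the one place where the value gets checked.
  void set_line_no(int Line_no) {
    assert(Line_no >= 0);
    line_no_ = Line_no;
  }

  int line_no_;  // left public so that hot paths may read it directly
};
```

A caller writes |p.set_line_no(42)| and |p.read_line_no()|: two distinct names, so a forgotten argument becomes a compile error instead of silently selecting the reader.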
@ Legend of terms.\fbreak \ptindent{th - thread} \ptindent{pp - grammar requesting parallel parse} \ptindent{ar - arbitrator} @*2 The preprocessor coding game.\fbreak To cope with variations in source code, the \CPLUSPLUS/ preprocessor's \#if directives are used. Appropriate values are tested in the constant expressions of the |#if| / \#elif directives. The |yacco2_compile_symbols.h| file contains the 2 preprocessor symbols for compilation of \O2: |THREAD_LIBRARY_TO_USE__| --- Pthreads(0) or Microsoft(1) thread library, and |THREAD_VS_PROC_CALL__| --- run by thread(0) or by a procedure call(1). |THREAD_VS_PROC_CALL__| is an optimization attempt or a bailout when the platform being ported to has threading problems. Please see ``Notes to myself'' as to why it's been removed. Initially the symbols below were used to control the inclusion of tracing code by the macro preprocessor. This really was a pain-in-the-???. As the number of options increased, how many \O2 library variations do you need? So now there are only 2 \O2 library flavours: clean-no-chafe tracing code and all-u-can-trace. To achieve this binary approach to \O2 libraries, {\bf instead of conditionals}, {\bf global tracing variables} are now used that are checked at runtime to exercise their tracing behaviors. The run program that uses the \O2 library can use the |YACCO2_define_trace_variables| macro to generate the tracing variable definitions. You can still do it the hard way by individually coding each definition but why not use this shortcut? So far these tracing global definitions take a binary value of 0 indicating do-not-trace while 1 means use it. Their runtime presence within \O2's library adds a very slight run speed bump for testing whether it's nobler to trace or not...but their benefits outweigh their hiccups. One can turn their use on or off anywhere throughout one's code. 
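The runtime test described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the namespace |sketch| stands in for yacco2 so that the fragment compiles on its own, and the helper |trace_fetched_terminal| is hypothetical. Only the variable name |YACCO2_T__| and its 0 / 1 meaning come from this document.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in namespace (assumption) so the sketch compiles without the library.
namespace sketch {

int YACCO2_T__ = 0;  // 0: do-not-trace, 1: trace the terminal when fetched

// Hypothetical helper: writes a trace line only when the variable is on.
// The test happens at runtime, so one library build serves both cases.
void trace_fetched_terminal(std::ostream& Log, const std::string& Terminal) {
  if (YACCO2_T__ == 0) return;  // the "very slight run speed bump"
  Log << "fetched: " << Terminal << '\n';
}

}  // namespace sketch
```

Setting |sketch::YACCO2_T__ = 1;| before a call switches the trace on, and setting it back to 0 switches it off again, without recompiling anything.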
@^Directory of variable variables controlling various macros@> Directory of variables:\fbreak \ptindent{|YACCO2_T__| --- trace terminal when fetched} \ptindent{|YACCO2_TLEX__| --- trace macros of emitted grammar: rules and user emergency macros} \ptindent{|YACCO2_MSG__| --- trace thread messages} \ptindent{|YACCO2_MU_TRACING__| --- trace acquire / release of trace mutex} \ptindent{|YACCO2_MU_TH_TBL__| --- trace acquire / release mutex of thread table} \ptindent{|YACCO2_MU_GRAMMAR__| --- trace acquire / release each grammar's mutex} \ptindent{|YACCO2_TH__| --- trace the parse stack: fsa and syntax directed activities } \ptindent{|YACCO2_AR__| --- trace arbitrator procedure} \ptindent{|YACCO2_THP__| --- trace thread performance} \ptindent{|VMS__| --- Alpha VMS port to correct their Pthread limitations} \ptindent{|VMS_PTHREAD_STACK_SIZE__| see bug's talk and |yacco2_compile_symbols.h|} They are enrobed by namespace yacco2. To set a trace variable, be sure the namespace is declared: either explicitly as in: \fbreak \ptindent{|yacco2::YACCO2_T__| = 1;} or implicitly by a ``using namespace yacco2;'' statement somewhere preceding the assignment:\fbreak \ptindent{using namespace yacco2;} \ptindent{...} \ptindent{|YACCO2_T__| = 1;} @*2 Thread library use.\fbreak |THREAD_LIBRARY_TO_USE__| indicates what thread library to gen up. {\bf It is a macro conditional symbol}. There are currently 2 libraries supported: Microsoft's thread support and the |Pthread| POSIX library. Both libraries have been used. The Pthread library of 32 and 64 bit flavours was tested on HP's VMS operating system --- Alpha hardware, Apple's OS X PowerPC laptop, and Sun's Solaris Ultra M20 AMD 64 bit dual core work station. As |THREAD_LIBRARY_TO_USE__| is binary valued for now, the value 1 selects the Microsoft thread library while the value 0 selects the |Pthread| library. 
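A minimal sketch of how such a binary-valued conditional symbol can steer compilation. The function |thread_library_name| is hypothetical and exists only to make the selected branch visible. The default of 0 is also an assumption, added so that the fragment compiles alone; normally the symbol would come from |yacco2_compile_symbols.h|.

```cpp
#include <cassert>

// Assumption: defaulted here only so that the sketch stands alone.
#ifndef THREAD_LIBRARY_TO_USE__
#define THREAD_LIBRARY_TO_USE__ 0
#endif

#if THREAD_LIBRARY_TO_USE__ == 1
// Microsoft thread library branch.
inline const char* thread_library_name() { return "Microsoft"; }
#elif THREAD_LIBRARY_TO_USE__ == 0
// Pthread POSIX library branch.
inline const char* thread_library_name() { return "Pthread"; }
#else
// Any other value is rejected at compile time rather than guessed at.
#error "THREAD_LIBRARY_TO_USE__ must be 0 (Pthread) or 1 (Microsoft)"
#endif
```

The closing |#error| branch is a design choice of this sketch: should a third library ever be added, a forgotten branch fails loudly at compile time.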
@*2 Parsing trace variables.\fbreak To help in debugging a grammar, the following variable symbols are defined: |YACCO2_T__| , |YACCO2_TH__| , |YACCO2_TLEX__| , |YACCO2_MSG__| , |YACCO2_MU_GRAMMAR__| , and |YACCO2_AR__|. So far the tracing facilities fall into 3 categories: trace the token when fetched, trace the message correspondence between threads, and trace the parsing stack of the grammar per action taken. Each symbol name tries by use of a suffix to indicate its functionality. For example, |_MSG__| suffix controls tracing of the messages between all threads and process. Specific arbitrator functor uses the |_AR__| suffix. These are workers supporting parallel parsing per grammar that require arbitration and thread control. The symbols are all binary expressions where ``1'' (one) includes their functionality. As parallel parsing can use many threads, to refine the volume of traced output, macros that use these symbols |YACCO2_TLEX__| ,|YACCO2_TH__| , and |YACCO2_AR__| also test whether their associated grammar has the fsm's debug parameter value of `true'. |YACCO2_TLEX__| symbol controls the specific tracings that are emitted by |Yacco2| in the \CPLUSPLUS/ code per rule. |YACCO2_MU_xxx__| helps to verify that mutexes are properly acquired and released. There are 2 contexts in which mutexes are used:\fbreak \ptindent{1) global mutexes --- thread table and tracing} \ptindent{2) grammar mutex} To aid in identifying a grammar mutex, |(UN)LOCK_MUTEX_OF_CALLED_PARSER| external routines were created so that the grammar's context could be passed as a parameter. This allowed one to trace the grammar's name and assigned thread no. Why are |LOCK_MUTEX| and |UNLOCK_MUTEX| routines not sufficient? There are contexts where the parse context is too far down the chain of calls to pass the parser context or there is no parser context available: eg, handle tracing by the grammar writer outside the parser context. 
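The idea of passing the grammar's context to the lock routines, so that the trace can name the grammar and its thread no, can be sketched as below. Everything here is an assumption made for illustration: the structure |Parser_context|, the lowercase routine names, the trace wording, and the use of C++11 |std::mutex| in place of the Pthread or Microsoft primitives that the library actually wraps. The real signatures of the |(UN)LOCK_MUTEX_OF_CALLED_PARSER| routines are not shown in this section.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {  // stand-in namespace, not yacco2

int YACCO2_MU_GRAMMAR__ = 0;  // 1: trace acquire / release of a grammar's mutex

// Hypothetical parser context carrying what the trace wants to print.
struct Parser_context {
  Parser_context(const std::string& Name, int Thread_no)
      : grammar_name_(Name), thread_no_(Thread_no) {}
  std::string grammar_name_;
  int thread_no_;
  std::mutex mu_;  // std::mutex swapped in for the library's own primitive
};

// Lock the grammar's mutex; with tracing on, report who acquired it.
void lock_mutex_of_called_parser(Parser_context& P, std::ostream& Log) {
  P.mu_.lock();
  if (YACCO2_MU_GRAMMAR__)
    Log << "acquired " << P.grammar_name_ << " thread " << P.thread_no_ << '\n';
}

// Unlock the grammar's mutex; the trace line is written while still held.
void unlock_mutex_of_called_parser(Parser_context& P, std::ostream& Log) {
  if (YACCO2_MU_GRAMMAR__)
    Log << "releasing " << P.grammar_name_ << " thread " << P.thread_no_ << '\n';
  P.mu_.unlock();
}

}  // namespace sketch
```

A context-free lock routine could only report that some mutex was taken. Carrying the context lets an unmatched acquire in the trace be pinned on one grammar and one thread.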
@*2 Thread performance.\fbreak To get a feel of why threads are a tad sluggish, the |YACCO2_THP__| conditional was invented. It allows one to see the serpentine meanderings of how the thread library works: flow control dodos. When the environment is a single cpu, the flow control is how the cpu relinquishes control to the various threads. As cpus are added, this serpentine tracking becomes non-deterministic: that is, the traces are parallel or branched, competing on the same race track side-by-side where the number of lanes is the number of cpus actively running. @*2 Section organization.\fbreak To control the output of various |cweb| code sections, the section names and their order are as follows: |@|,|@|,|@|, and |@|. As include statements can take on different definitions: type, constant, structures, sometimes the dependency of the include file order is important particularly when the files are outside one's developmental control or there are circular references. For structures not defined yet but referenced, at the point of use, the standard \CPLUSPLUS/ forward declaration statement will be added in front of the to-be-defined structure. Maybe a bit imperfect but practical. So this is my take... @*2 C macros.\fbreak Conditionally defined macros for tracing. They are bracketed by the conditional preprocessor code controlling their inclusion. @= // c macros @*2 Include files.\fbreak To start things off, these are the Standard Template Library (STL) includes needed by Yacco2. @= @; @i "/usr/local/yacco2/library/gbl_defs.w" @*2 Typedef definitions.\fbreak These are the basic types to aid in porting or maintaining the code. Other sections will add to this section as they get developed. 
@= typedef const char* KCHARP; typedef unsigned char UCHAR; typedef char CHAR; typedef UCHAR* UCHARP; typedef unsigned short int USINT; typedef short int SINT; typedef CHAR* CHARP; typedef const void* KVOIDP; typedef void* VOIDP; typedef int INT; typedef unsigned int UINT; typedef unsigned int ULINT; typedef void (*FN_DTOR)(VOIDP This,VOIDP Parser); typedef UCHARP LA_set_type; typedef LA_set_type LA_set_ptr;@/ struct CAbs_lr1_sym; struct State; struct Parser; struct Shift_entry; struct Shift_tbl; struct Reduce_tbl; struct State_s_thread_tbl; struct Thread_entry; struct T_array_having_thd_ids; struct Set_entry; struct Recycled_rule_struct; struct Rule_s_reuse_entry; typedef Shift_entry Shift_entry_array_type [1024*100]; typedef Set_entry Set_entry_array_type [1024*100]; @*2 Recursion index for internal tracing of output.\fbreak Used to prefix spaces according to its count. Allows one to output messages to {\bf{lrclog}} where the prefix number of spaces is the recursive call level. @d Recursion_count() int RECURSION_INDEX__(0); @ Structure definitions. @+= // structures @*2 Global external variables from yacco2's linker.\fbreak Apart from |PTR_LR1_eog__| which is defined by the |yacco2_k_symbols.lex| grammar, yacco2's linker generates the balance of these symbol definitions. All these symbols are covered by namespace yacco2. They are dangling references within this library that get resolved by the regular language linker from other objects when the program is built. The first 5 symbols can only be defined by yacco2's linker due to the condition that all grammars and their threads must be known before these symbols can be defined specific to the developed language. Here we have a general piece of software that has dangling references of future knowns. 
@= // Global externals from yacco2's linker and |yacco2_k_symbols.lex| extern void* THDS_STABLE__; extern void* T_ARRAY_HAVING_THD_IDS__; extern void* BIT_MAPS_FOR_SALE__; extern int TOTAL_NO_BIT_WORDS__; extern int BIT_MAP_IDX__; extern CAbs_lr1_sym* PTR_LR1_eog__; @*2 Global tracing variables.\fbreak See |The preprocessor coding game| for their meanings. @= extern int YACCO2_T__; extern int YACCO2_TLEX__; extern int YACCO2_MSG__; extern int YACCO2_TH__; extern int YACCO2_AR__; extern int YACCO2_THP__; extern int YACCO2_MU_TRACING__; extern int YACCO2_MU_TH_TBL__; extern int YACCO2_MU_GRAMMAR__; @*2 Global variables.\fbreak @= // gbl variables @*2 External rtns. @= // extern rtns + gbl variables @ Using library's namespace yacco2. The acronyms should be obvious to the user within their context. @= using namespace yacco2; @ Begin namespace yacco2. @= namespace yacco2{ @ End namespace yacco2. @= }; // end namespace yacco2 @ Include Yacco2 header. @= #include "yacco2.h" @ Include Yacco2's raw characters header. @= #include "yacco2_characters.h" @ Include Yacco2's constants header. @= #include "yacco2_k_symbols.h" @ Include Yacco2's conditional compile control symbols header. @= #include "yacco2_compile_symbols.h" @ Include Yacco2's arbitrator's begin code. @= #include "war_begin_code.h" @ Include Yacco2's arbitrator's end code. @= #include "war_end_code.h" @ A wrapper file that brings in the required Standard Template Library (STL) containers used by Yacco2. @= #include #include #include #include "std_includes.h" #include @ Accrue yacco2 code. @= // accrue yacco2 code @*2 |cweb| output of Yacco2's user library.\fbreak The implementation code is emitted by |cweb|'s @@c or @@( operators throughout this discourse. Definitions etc are outputted to the common include file |yacco2.h|. All implementations will include this file into their implementation. @ Create header file for Yacco2 library environment. 
Note, the ``include search'' directories for the \CPLUSPLUS/ compiler have to be supplied. @(yacco2.h@>= @; #ifndef yacco2_ #define yacco2_ 1 @; @h @; @; @; @; @; @; @; @; @; namespace NS_yacco2_k_symbols { extern yacco2::CAbs_lr1_sym* PTR_LR1_questionable_shift_operator__; extern yacco2::CAbs_lr1_sym* PTR_LR1_eog__; extern yacco2::CAbs_lr1_sym* PTR_LR1_eolr__; extern yacco2::CAbs_lr1_sym* PTR_LR1_parallel_operator__; extern yacco2::CAbs_lr1_sym* PTR_LR1_fset_transience_operator__; extern yacco2::CAbs_lr1_sym* PTR_LR1_invisible_shift_operator__; extern yacco2::CAbs_lr1_sym* PTR_LR1_all_shift_operator__; }; @; #endif @*2 Yacco2's library implementation.\fbreak Start the code output to |yacco2.cpp| by appending its include file. @(yacco2.cpp@>= @; @; @; @;