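[Set] PARSER parser fileid
Use the SET PARSER command to define a syntax coloring parser named
parser, based on the KEDIT Language Definition file fileid.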
For example, if you were working with a hypothetical language called
LANG and you had described the language in a KEDIT Language Definition
file called LANGDEF.KLD, you could define a parser called LANG
with the command
If files in your language always had an extension of, for example,
.LNG, you could use the SET AUTOCOLOR command to tell KEDIT to always
use the LANG parser for .LNG files:
Copies of all of the KLD files built into KEDIT are included in the
SAMPLES subdirectory of the main KEDITW directory. For example, there
is a C.KLD file that is an exact copy of the *C.KLD file that is built
into KEDIT. If you modify one of these copies you should save it in
a different location (normally the USER subdirectory of the main KEDITW
directory) and load it by issuing a SET PARSER command referring to
the modified file.
Note that whenever you issue the SET PARSER command, the KLD file
that you specify is loaded into memory, even if an identical SET PARSER
command has previously been issued. This makes it easy to develop
and test modifications to KLD files, because if you make changes to
a KLD file you can simply reissue the appropriate SET PARSER command
and KEDIT will load the updated version of the file. Any files whose
syntax coloring is controlled by your parser will automatically be
re-colored, so you can easily see the effect of the changes you have
made to the KLD file.
The rules given here for KLD files are flexible enough to describe
a number of popular programming languages, to handle varying syntax
conventions for comments, strings, numbers, etc., and to have user-configurable
lists of keywords. The goal is to handle many common language variants
with a relatively small number of parameters.
KLD files are divided into sections. Each section begins with a section
header, consisting of a colon in column one followed immediately by
the section name. Following each section header line are one or more
lines of parameter information.
To improve readability, you can insert blank lines at any point in
a KLD file. Additionally, any line whose first nonblank character
is an asterisk (``*'') is considered a comment line and is ignored
by KEDIT. For example:
Here are descriptions of each kind of KLD file section:
An example:
PREPROCESSOR indicates that the language supports
a C-like preprocessor mechanism, and that preprocessor keywords are
preceded by the specified character. For example:
REXX indicates that the REXX language is being described.
In REXX, certain identifiers are sometimes considered keywords and
are sometimes considered variables, depending on the context in which
they are used, and the REXX option tells KEDIT to do the special processing
that this requires.
If the :OPTION section is omitted, KEDIT does not do special handling
of preprocessor keywords or of REXX keywords.
In many languages, there are different rules for what is valid as
the first character of an identifier and for what is valid in additional
characters in an identifier. To handle this situation, you can include
two identifier specifications: first specify what is valid as the
first identifier character and then specify what is valid in the remaining
characters. For example, in C programs the first character of an identifier
can be any alphabetic character or can be an underscore, while the
remaining characters of an identifier can be alphabetic or can be
underscores, but can also be numeric digits:
Some languages have single-line comments, which are introduced by
some type of comment delimiter and cannot continue for multiple lines.
Some languages have comments with both a starting and an ending delimiter.
This kind of comment can usually continue for multiple lines, but
in some languages may be restricted to a single line.
For example, C++ allows comments that are introduced by a pair of
slashes (``//'') and continue until the end of the line. C++ also
allows comments that can continue for multiple lines, introduced by
a slash-asterisk pair (``/*'') and terminated by an asterisk-slash
pair (``*/''). The corresponding :COMMENT section would be:
ANY indicates that appearance of the comment delimiter anywhere
on a line (except within a quoted string) starts a comment.
FIRSTNONBLANK indicates that the comment delimiter starts a comment
only if it is the first nonblank item on a line.
COLUMN n indicates that the comment delimiter starts a comment
only if it appears in column n.
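As an illustrative sketch (not copied from any KLD file shipped with KEDIT), the ANY, FIRSTNONBLANK, and COLUMN n options might be used as shown here. The three LINE entries are grouped in one section only for brevity; each would normally belong to a different language. The second entry describes the comment convention of KLD files themselves, and the third assumes a fixed-format language, such as fixed-format COBOL, that marks comment lines with an asterisk in column 7.

```
* Illustrative only: a real definition lists just the forms its language uses
:COMMENT
* "//" starts a comment anywhere on a line (outside quoted strings)
 line // any
* "*" starts a comment only when it is the first nonblank item on the line
 line * firstnonblank
* "*" starts a comment only when it appears in column 7
 line * column 7
```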
Comments with both starting and ending delimiters are described using
the format
NEST indicates that multi-line comments can be nested inside
multi-line comments, with the comments ending only when as many comment
end delimiters as comment start delimiters have been encountered.
NONEST is the default and indicates that comments cannot be nested,
and that a comment ends as soon as the next comment end delimiter
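has been encountered. For example, consider a comment that contains
a second, complete comment followed by the line ``x = 17''.
In the REXX language, which allows nested comments, ``x = 17'' is still
part of the outer comment. In a language that does not allow nested
comments, the comment ends at the first comment end delimiter, so
``x = 17'' is not part of a comment.
MULTIPLE indicates that the comments can continue for multiple
lines; this is the default and need not be specified.
SINGLE indicates that, even though paired delimiters are being
used, the comments must begin and end on a single line.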
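The NEST and SINGLE options could be sketched as follows. The first fragment assumes REXX-style comments, which nest and may span lines. The second assumes a hypothetical language whose ``(*'' and ``*)'' comments must begin and end on one line. Neither fragment is taken from a KLD file shipped with KEDIT, and each would appear in its own KLD file.

```
* Fragment 1: nested comments that may span lines (MULTIPLE is the default)
:COMMENT
 paired /* */ nest

* Fragment 2 (separate, hypothetical language): non-nested, single-line only
:COMMENT
 paired (* *) nonest single
```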
Header lines are specified in the same way as single-line comments:
SINGLE means that your language uses strings enclosed
in single quotes.
DOUBLE means that your language uses strings enclosed
in double quotes.
DELIMITER c specifies that the character c is used
as a string delimiter.
SINGLE, DOUBLE, and DELIMITER
If the :STRING section is omitted, KEDIT's syntax coloring does not
recognize any strings in your files.
In the :LABEL section, instead of a DELIMITER line, you can specify a COLUMN n line.
KEDIT's syntax coloring facility uses the information in the :MATCH
section for two purposes:
First, items at different nesting levels are colored differently,
so you can easily see which items match. For example, in the line
Second, when you use the CMATCH command (assigned by default to Shift+F3)
to find the matching item for the text at the cursor position, KEDIT
can properly match any items described in the :MATCH section. With
the cursor on the first DO in the following example, Shift+F3 can
move the cursor to the second END in the example:
For example,
Keywords are normally colored according to the current ECOLOR D setting,
and preprocessor keywords according to the current ECOLOR F setting.
It is sometimes useful to specify different types of keywords that
will be colored differently. To do this, you can specify ALTERNATE n following a keyword.
A sample :KEYWORD section:
Use the TAG line to specify the character string that initiates a
markup tag and the character string that terminates a markup tag.
In an HTML file, where a typical line of text might be:
HTML lets you use entity references like ``&lt;'' or character references
like ``&#60;'' to refer to special characters. These references begin
with an ampersand (``&'') and end with a semi-colon (``;''). This
would be specified in the :MARKUP section as:
When the syntax coloring parser processes a line of your file, it
will treat the excluded columns as if they were entirely blank. By
default, the excluded columns will be displayed with no special highlighting,
but you can specify that any of the 9 ALTERNATE colors be used. For
example,
The :POSTCOMPARE section can contain CLASS lines and TEXT lines.
CLASS lines specify a set of characters that you want to have colored,
using the same regular expression character class notation that is
used in the :IDENTIFIER section. For example,
Note that it is not useful to include valid identifiers in the :POSTCOMPARE
section, since the parser checks for identifiers before :POSTCOMPARE
is processed, so identifiers, even identifiers that are not listed
in the :KEYWORD section, will never be matched by :POSTCOMPARE. For
this reason, any identifiers that you want to color should be included
in the :KEYWORD section.
SET PARSER LANG LANGDEF.KLD
After issuing the SET PARSER command, you could then issue the command
SET COLORING ON LANG
to use this parser to control syntax coloring for the current file.
SET AUTOCOLOR .LNG LANG
SET PARSER commands are typically executed from your KEDIT profile
when KEDIT is initially loaded. For example:
* if first profile execution in a session,
* setup the LANG parser and then
* cause all .LNG files to be colored using the LANG parser
if initial() then do
   'set parser lang langdef.kld'
   'set autocolor .lng lang'
end
Several language definitions are built into KEDIT, and when KEDIT
is loaded it automatically issues SET PARSER commands that use these
language definitions to set up its default parsers. See the description
of the SET PARSER command for a complete list of built-in parsers.
To distinguish these internal language definition files from actual
disk files, KEDIT uses an asterisk as the first character of their
names. For example, the command
SET PARSER C *C.KLD
tells KEDIT to use *C.KLD as the Language Definition File associated
with the C parser. The asterisk in the name tells KEDIT to use the
special file *C.KLD, which is built into KEDIT, and not to look for
the file on disk.
KLD File Format
Here is a description of the format
of KEDIT Language Definition files, which usually have an extension
of .KLD. The best way to get started with KLD files is to look over
this description briefly, and then to examine some of the KLD files
that are included in the SAMPLES directory of the main KEDITW directory.
* Sample KLD contents
:case
 ignore
:identifier
 [a-z] [a-z0-9]
:keyword
 if
 then
 else
The above example starts with a comment line, followed by a :CASE
section with one parameter line, an :IDENTIFIER section with one parameter
line, and a :KEYWORD section with three parameter lines. Parameter
information is usually indented from column one, as in this example,
but it does not have to be.
:CASE section
The :CASE section consists of a single line with the word RESPECT
or the word IGNORE. RESPECT means that the language you are describing
is case-sensitive (for example, ``else'' and ``ELSE'' are not considered
identical), and IGNORE means that the language is case-insensitive.
:CASE
respect
If the :CASE section is omitted, KEDIT assumes case insensitivity.
If present, the :CASE section must precede the :IDENTIFIER section.
:OPTION section
The :OPTION section consists of a single line containing special options
that are needed to properly process some languages. There are currently
two possible options:
PREPROCESSOR char
:OPTION
preprocessor #
REXX
:IDENTIFIER section
The :IDENTIFIER section consists of a single line that specifies what
characters can appear within identifiers in the language you are describing.
These characters are specified in the same way as character class
specifications within KEDIT regular expressions. They consist of lists,
enclosed in square brackets, of valid characters and/or ranges of
valid characters (with the first character in the range, a minus sign,
and the last character in the range). For example,
:IDENTIFIER
[a-zA-Z]
specifies that any set of alphabetic characters is a valid identifier.
:IDENTIFIER
[a-zA-Z_] [a-zA-Z0-9_]
In some cases (BASIC programs are the main example), the last character
of an identifier can be a special character that is not valid elsewhere
in an identifier. For example, in BASIC, ABC@ is a valid identifier.
To handle this, you can include a third item specifying the special
characters acceptable only at the end of an identifier. For example:
:IDENTIFIER
[a-zA-Z] [a-zA-Z0-9_] [%&!#@$]
The :IDENTIFIER section is required if you will be using the :KEYWORD
section to give a list of the keywords in your language. The :IDENTIFIER
section must appear before the :KEYWORD section.
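The ordering rules can be summarized with a minimal skeleton. This is a sketch for the hypothetical LANGDEF.KLD file mentioned earlier, not a definition shipped with KEDIT. The keyword list is a placeholder, and the identifier classes simply reuse the C-style example above.

```
* Hypothetical LANGDEF.KLD skeleton showing the required section order
* :CASE, if present, comes before :IDENTIFIER
:CASE
 ignore
* :IDENTIFIER comes before :KEYWORD
:IDENTIFIER
 [a-zA-Z_] [a-zA-Z0-9_]
:KEYWORD
 if
 then
 else
```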
:COMMENT section
Use the :COMMENT section to describe the rules for comments in your
language. Each line of the :COMMENT section describes one type of
comment; since some languages have multiple methods for specifying
comments, there may be multiple lines in the :COMMENT section.
:COMMENT
line // any
paired /* */ nonest
Line comments are described using the format
LINE delim ANY|FIRSTNONBLANK|COLUMN n
where the last item is ANY, FIRSTNONBLANK, or COLUMN n.
PAIRED delim1 delim2 [NEST|NONEST] [MULTIPLE|SINGLE]
where NEST|NONEST determines whether comments can be nested. For example, consider:
/*
/* here is a comment */
x = 17
*/
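MULTIPLE|SINGLE determines whether a comment can span multiple lines
(MULTIPLE, the default) or must begin and end on a single line (SINGLE).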
:HEADER section
The :HEADER section describes header lines. Header lines are used
to indicate the start of a new section in certain types of files;
the section headers in .KLD files are examples of header lines.
LINE delim ANY|FIRSTNONBLANK|COLUMN n
As far as KEDIT's syntax coloring is concerned, the only difference
between single-line comments and headers is that comments are displayed
using ECOLOR A and headers are displayed using ECOLOR G. An example
of a :HEADER section that describes .KLD file section headers:
:HEADER
line : column 1
:STRING section
Use the :STRING section to describe the types of quoted strings used
in your language. Each line of the :STRING section describes one type
of string; since some languages have multiple methods for specifying
strings, there may be multiple lines in the :STRING section. There
are three possibilities:
SINGLE
DOUBLE
DELIMITER c
:NUMBER section
Use the :NUMBER section to indicate the format of numbers in your
language. The :NUMBER section is a single line long, with the word
INTEGER, DECIMAL, C, COBOL, PASCAL, REXX, or ADA.
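For instance, a definition for a language whose numeric literals follow C conventions might contain the section below. This is a sketch based only on the list of words above. The exact literal forms that each word enables are not spelled out here.

```
:NUMBER
 c
```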
:LABEL section
Use the :LABEL section to define what counts as a label in your language.
The label section normally consists of a single line, but can involve
multiple lines if your language has multiple ways of specifying labels.
The label description has the format
DELIMITER delim FIRSTNONBLANK|ANY|COLUMN n
Alternatively, a line can have the format
COLUMN n
to indicate that any non-keyword identifier beginning in the specified column
should be treated as a label, with no need for a delimiter following the label.
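The two forms of label description might look like the fragments below. Both are sketches for hypothetical languages. The first assumes that a label is an identifier followed by a colon, and also assumes that FIRSTNONBLANK has the same positional meaning here as in the :COMMENT section. The second assumes that labels always begin in column 1.

```
* Fragment 1: label followed by ":" (hypothetical language)
:LABEL
 delimiter : firstnonblank

* Fragment 2 (separate, hypothetical language): labels start in column 1
:LABEL
 column 1
```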
:MATCH section
Use the :MATCH section to specify the matching characters and identifiers
that indicate nested structure within your language. For example,
in most languages, left and right parentheses can be nested and must
match up properly in a syntactically correct program. In some languages
the same is true of keywords like BEGIN and END.
if (f(x + y + z) = 17)
KEDIT can display the inner parentheses and the outer parentheses
in different colors.
if a = 5 then do
j = 17
do i = 1 to 10
say i*j
end
end
Each line of the :MATCH section has either two or three items. The
first item specifies the identifiers or character sequences that introduce
a matchable construct. The second item specifies the identifiers or
character sequences that end a matchable construct. The third item
is optional, and is used to specify items that always appear inside
of a matchable construct.
:MATCH
( )
{ }
#if #endif #else
Here, three matchable constructs are specified:
:MATCH
( )
{ }
#ifdef,#if,#ifndef #endif #else,#elif,#elseif
This is because any of #ifdef, #if, and #ifndef can match up with
#endif, with any of #else, #elif, and #elseif allowed between them.
As in this example, you can specify multiple equivalent items in a
:MATCH section, separated by commas.
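Some notes on using the :MATCH section: when several starting items
share the same ending item, you should not list them on separate lines, as in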
:MATCH
DO END
BEGIN END
but should instead use
:MATCH
DO,BEGIN END
:KEYWORD section
Use the :KEYWORD section to specify the keywords in your language.
Each line of the :KEYWORD section has the form
keyword [ALTERNATE n] [TYPE m]
where ALTERNATE n selects one of the alternate keyword colors, and TYPE m
is used only when REXX has been specified in the :OPTION section,
and determines what to treat as a REXX keyword, subkeyword, etc.
:KEYWORD
if
then
else
do
end
switch
for
procedure alternate 1
If the :KEYWORD section is omitted, KEDIT's syntax coloring facility
does not recognize any keywords. If the :KEYWORD section is specified,
it must be preceded by the :IDENTIFIER section.
:MARKUP section
The :MARKUP section is used with HTML and similar markup languages.
It can contain a TAG line and, optionally, a REFERENCE line.
<H1>Level 1 header</H1>
``<'' initiates a tag, and ``>'' terminates it. This would be specified
in the :MARKUP section as
:MARKUP
TAG < >
Use the REFERENCE line to specify the character string that initiates
a character or entity reference and the character string that terminates
it.
:MARKUP
TAG < >
REFERENCE & ;
The following special rules apply if your KLD file contains a :MARKUP
section:
<P>This is a new paragraph.
``<P>'' would be highlighted.
<A HREF="film_clip.jpg">
the quoted string is displayed using ECOLOR B, while
the rest of the tag is displayed using ECOLOR T.
:COLUMN section
Use the :COLUMN section to specify that the parser should ignore certain
columns of your file. For example, in COBOL columns 1 through 6 of
a file and all columns beyond column 72 of a file are ignored by the
compiler. This would be specified as
:COLUMN
EXCLUDE 1 6
EXCLUDE 73 *
Each line of the :COLUMN section has the word EXCLUDE followed by the starting
and ending column of a range of columns that the parser is to ignore.
The ending column can be given as an asterisk to indicate that all
columns through the end of the line are to be ignored.
:COLUMN
EXCLUDE 1 10 ALTERNATE 2
would display columns 1 through 10 of your file using ECOLOR 2.
:POSTCOMPARE section
The :POSTCOMPARE section is used to color character sequences that
are not handled by any of the other sections of a KLD file. For example,
you might want to color operators like ``+'', ``-'', and ``='', or
items like ``.T.'' and ``.F.'', which indicate True or False in xBase
programs but are not valid identifiers.
CLASS [+-=/]
means that ``+'', ``-'', ``='', and ``/'' characters are to be colored.
KEDIT uses ECOLOR I by default, but you can instead specify any of
the four alternate keyword colors. For example:
CLASS [+-=/] ALTERNATE 2
TEXT lines specify a string of nonblank characters that is to be colored.
For example,
TEXT .T.
would color the character sequence ``.T.''. KEDIT uses ECOLOR D by
default, but you can specify an alternate keyword color. For example:
TEXT .T. ALTERNATE 3
You can specify any number of CLASS or TEXT lines in a :POSTCOMPARE
section. When applying syntax coloring to your file, the :POSTCOMPARE
section is processed last. That is, KEDIT first checks for identifiers,
numbers, comments, tags, etc., and checks the items in the :POSTCOMPARE
section only if none of these are found.
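Putting the CLASS and TEXT examples above together, a complete :POSTCOMPARE section for an xBase-like language might look like this. It is a sketch assembled from the lines shown in this section. The choice of alternate colors 2 and 3 is arbitrary.

```
:POSTCOMPARE
* color the operator characters using alternate keyword color 2
 class [+-=/] alternate 2
* color the logical constants using alternate keyword color 3
 text .T. alternate 3
 text .F. alternate 3
```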
Copyright © 1996-2012 Mansfield Software Group, Inc.