Package: AwkProc

Introduction

The package AwkProc is meant to provide an AWK-like programming environment for Tcl. It is an all-Tcl package that simulates AWK's pattern-action model.

Like AWK, it has special patterns that mark the beginning and end of a file, it can process several files one after another, and so on. It is, however, not a complete emulation of AWK.
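
The pattern-action model itself can be illustrated with a few lines of plain Tcl. The sketch below is not how AwkProc is implemented: the procedure runRules, the rules list and the output variable are hypothetical names invented for this illustration, and the input is a list of lines rather than a file. It only shows the idea of testing every line against every pattern and running the associated action on a match.

```tcl
# Hypothetical sketch of a pattern-action loop in plain Tcl.
# None of these names belong to AwkProc.
proc runRules {rules lines} {
    set output {}
    set NL 0
    foreach LINE $lines {
        incr NL
        foreach {pattern action} $rules {
            if {[regexp -- $pattern $LINE]} {
                # The action runs in this procedure's scope, so it can
                # read LINE and NL and append to output
                eval $action
            }
        }
    }
    return $output
}

# Each rule is a regular expression followed by a Tcl action body
set rules {
    {^#}    { lappend output "comment at line $NL" }
    {error} { lappend output "line $NL: $LINE" }
}
set result [runRules $rules {"# header" "all fine" "error: disk full"}]
puts [join $result \n]
```

As in AWK, a line that matches several patterns triggers each of the corresponding actions in turn.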

This document describes what it can and cannot do, at least not without additional programming. Because it is set up as an ordinary Tcl package, any missing capability can in principle be added with suitable Tcl code.

The set-up of this document is as follows:

Version and copyright

This document describes AwkProc, version 0.2, June 2001.

Usage of AwkProc is free, as long as you acknowledge the author, Arjen Markus (e-mail: arjen.markus@wldelft.nl).

There is no guarantee nor claim that the results are accurate.

Procedures

The AwkProc package defines the following public procedures:

Pattern matching works as follows:

Note:
Because the global variables, LINE in particular, are not protected, an action can modify them to prepare the input for further processing. For instance:

#
# First convert the input to lower-case, and trim any blanks
# - this makes the pattern much simpler
#
RegPattern "." {
   set LINE [string trim [string tolower $LINE]]
}
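
The effect of this normalisation step can be checked in isolation. In the short sketch below, LINE is set by hand as a stand-in for the variable that AwkProc supplies, and the sample text is made up. It shows that after lower-casing and trimming, a pattern no longer has to allow for leading blanks or upper-case letters.

```tcl
# Stand-in for the LINE variable (set by hand for this illustration)
set LINE "   PROC  Foo {A B}   "

# Lower-case first, then trim leading and trailing blanks
set LINE [string trim [string tolower $LINE]]

# A simple anchored pattern is now sufficient
set matches [regexp {^proc } $LINE]
puts "<$LINE> matches: $matches"
# prints: <proc  foo {a b}> matches: 1
```

Note that only leading and trailing blanks are removed; blanks inside the line are left as they are.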

Comparison

This section provides a concise comparison between the AwkProc package and the AWK language as known from UNIX. As AwkProc is by no means meant to be complete, there are quite a few limitations with respect to AWK. However, as AwkProc is written in Tcl, users can take advantage of a much more widely applicable scripting language.

Some obvious limitations of AwkProc:

Some obvious advantages of AwkProc:

An example

This section presents a small example that is intended to illustrate the type of processing that can be done, rather than something practical or complete.

The purpose is to:

Extracting the procedure headers (that is, the procedure name and arguments) is easy, as long as they appear on one line. In regular Tcl this would read:

if { [regexp {^[ ]*proc } $LINE] } { ... }

Similarly, finding the keyword "global" is easy: just replace "proc" by "global".
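
To see what these two expressions do and do not match, they can be tried on a handful of lines. The sample lines below are invented for this illustration; the loop variable is called LINE only to mirror the text above.

```tcl
# Made-up sample lines (not taken from any real source file)
set samples [list \
    "proc foo {a b} {}" \
    "   global counter" \
    "# proc in a comment" \
    "procedure x"]

set flags {}
foreach LINE $samples {
    # Optional leading blanks, the keyword, then a blank
    set isProc   [regexp {^[ ]*proc } $LINE]
    set isGlobal [regexp {^[ ]*global } $LINE]
    lappend flags [list $isProc $isGlobal]
    puts [format "%-22s proc=%d global=%d" $LINE $isProc $isGlobal]
}
```

Only the first line counts as a procedure header and only the second as a "global" statement. The trailing blank in the pattern keeps "procedure" from matching, and the anchor keeps the keyword inside the comment from matching. A header indented with tabs would be missed, because the pattern allows only blanks.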

Because we want to know roughly how many lines of code there are between the procedure headers (we make no attempt to exclude comments, as that would only complicate matters), we define a single parameter, anchor, which holds the line number of the last "proc" encountered.

Adding some print statements and some calculations, we end up with a script like this:


package require AwkProc

::AwkProc::BeginFile {
   ::AwkProc::Content set anchor 0
}

::AwkProc::RegPattern {^[ ]*proc } {
   if { [::AwkProc::Content get anchor] > 0 } {
      set numlines [expr {$NL - [::AwkProc::Content get anchor]}]
      puts "   (roughly $numlines lines of code)\n"
   }
   puts [format "%5d: %s" $NL $LINE]
   ::AwkProc::Content set anchor $NL
}

::AwkProc::RegPattern {^[ ]*global } {
   puts "   Global(s): [lrange $LINE 1 end]"
}

::AwkProc::EndFile {
   ::AwkProc::Content set anchor 0
}

::AwkProc::ProcessFiles $::argv
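
For comparison, roughly the same processing can be written in plain Tcl without AwkProc. This is a sketch under simplifying assumptions: the procedure summarise is a hypothetical name, it works on a list of lines instead of reading files, and it collects the report in a list instead of printing it directly.

```tcl
# Plain-Tcl sketch of the example above; summarise is a hypothetical helper
proc summarise {lines} {
    set report {}
    set anchor 0   ;# line number of the last "proc" seen, 0 if none
    set NL 0
    foreach LINE $lines {
        incr NL
        if {[regexp {^[ ]*proc } $LINE]} {
            if {$anchor > 0} {
                lappend report "   (roughly [expr {$NL - $anchor}] lines of code)"
            }
            lappend report [format "%5d: %s" $NL $LINE]
            set anchor $NL
        }
        if {[regexp {^[ ]*global } $LINE]} {
            # Treats the line as a Tcl list, as the script above does
            lappend report "   Global(s): [lrange $LINE 1 end]"
        }
    }
    return $report
}

# A small made-up source text to run the sketch on
set source {proc first {} {
   global counter
   incr counter
}
proc second {x} {
   return $x
}}
set report [summarise [split $source \n]]
puts [join $report \n]
```

The bookkeeping that the pattern-action version leaves to the package (reading lines, counting them, dispatching on patterns) has to be written out by hand here.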

Limitations and bugs

The list below is essentially a to-do list: