Go to the first, previous, next, last section, table of contents.


@dircategory GNU Packages * Gawk: (gawk). A text scanning and processing language. @dircategory Individual utilities * awk: (gawk)Invoking gawk. Text scanning and processing. @set DOCUMENT Web page @set CHAPTER chapter @set APPENDIX appendix @set SECTION section @set SUBSECTION subsection @set DARKCORNER (d.c.)

@ifnottex

"To boldly go where no man has gone before" is a Registered Trademark of Paramount Pictures Corporation.
Copyright (C) 1989, 1991, 1992, 1993, 1996-2001 Free Software Foundation, Inc.

This is Edition 3 of GAWK: Effective AWK Programming: A User's Guide for GNU Awk, for the 3.1.0 (or later) version of the GNU implementation of AWK.

Published by:

Free Software Foundation
59 Temple Place -- Suite 330
Boston, MA 02111-1307 USA
Phone: +1-617-542-5942
Fax: +1-617-542-2652
Email: gnu@gnu.org
URL: http://www.gnu.org/

ISBN 1-882114-28-0

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with the Invariant Sections being "GNU General Public License", the Front-Cover texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in the section entitled "GNU Free Documentation License".

  1. "A GNU Manual"
  2. "You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development."

Cover art by Etienne Suvasa.

To Miriam, for making me complete. To Chana, for the joy you bring us. To Rivka, for the exponential increase. To Nachum, for the added dimension. To Malka, for the new beginning.

Foreword

Arnold Robbins and I are good friends. We were introduced 11 years ago by circumstances--and our favorite programming language, AWK. The circumstances started a couple of years earlier. I was working at a new job and noticed an unplugged Unix computer sitting in the corner. No one knew how to use it, and neither did I. However, a couple of days later it was running, and I was root and the one-and-only user. That day, I began the transition from statistician to Unix programmer.

On one of many trips to the library or bookstore in search of books on Unix, I found the gray AWK book, a.k.a. Aho, Kernighan and Weinberger, The AWK Programming Language, Addison-Wesley, 1988. AWK's simple programming paradigm--find a pattern in the input and then perform an action--often reduced complex or tedious data manipulations to few lines of code. I was excited to try my hand at programming in AWK.

Alas, the @command{awk} on my computer was a limited version of the language described in the AWK book. I discovered that my computer had "old @command{awk}" and the AWK book described "new @command{awk}." I learned that this was typical; the old version refused to step aside or relinquish its name. If a system had a new @command{awk}, it was invariably called @command{nawk}, and few systems had it. The best way to get a new @command{awk} was to @command{ftp} the source code for @command{gawk} from prep.ai.mit.edu. @command{gawk} was a version of new @command{awk} written by David Trueman and Arnold, and available under the GNU General Public License.

(Incidentally, it's no longer difficult to find a new @command{awk}. @command{gawk} ships with Linux, and you can download binaries or source code for almost any system; my wife uses @command{gawk} on her VMS box.)

My Unix system started out unplugged from the wall; it certainly was not plugged into a network. So, oblivious to the existence of @command{gawk} and the Unix community in general, and desiring a new @command{awk}, I wrote my own, called @command{mawk}. Before I was finished I knew about @command{gawk}, but it was too late to stop, so I eventually posted to a comp.sources newsgroup.

A few days after my posting, I got a friendly email from Arnold introducing himself. He suggested we share design and algorithms and attached a draft of the POSIX standard so that I could update @command{mawk} to support language extensions added after publication of the AWK book.

Frankly, if our roles had been reversed, I would not have been so open and we probably would have never met. I'm glad we did meet. He is an AWK expert's AWK expert and a genuinely nice person. Arnold contributes significant amounts of his expertise and time to the Free Software Foundation.

This book is the @command{gawk} reference manual, but at its core it is a book about AWK programming that will appeal to a wide audience. It is a definitive reference to the AWK language as defined by the 1987 Bell Labs release and codified in the 1992 POSIX Utilities standard.

On the other hand, the novice AWK programmer can study a wealth of practical programs that emphasize the power of AWK's basic idioms: data driven control-flow, pattern matching with regular expressions, and associative arrays. Those looking for something new can try out @command{gawk}'s interface to network protocols via special `/inet' files.

The programs in this book make clear that an AWK program is typically much smaller and faster to develop than a counterpart written in C. Consequently, there is often a payoff to prototype an algorithm or design in AWK to get it running quickly and expose problems early. Often, the interpreted performance is adequate and the AWK prototype becomes the product.

The new @command{pgawk} (profiling @command{gawk}), produces program execution counts. I recently experimented with an algorithm that for n lines of input, exhibited @ifnottex ~ C n^2 performance, while theory predicted @ifnottex ~ C n log n behavior. A few minutes poring over the `awkprof.out' profile pinpointed the problem to a single line of code. @command{pgawk} is a welcome addition to my programmer's toolbox.

Arnold has distilled over a decade of experience writing and using AWK programs, and developing @command{gawk}, into this book. If you use AWK or want to learn how, then read this book.

Michael Brennan
Author of @command{mawk}


Go to the first, previous, next, last section, table of contents.