The Packrat Parsing and
Parsing Expression Grammars Page
Introduction
Parsing expression grammars (PEGs) are
an alternative to context free grammars for formally specifying syntax, and
packrat parsers are parsers for PEGs
that operate in guaranteed linear time through the use of memoization.
For a brief technical summary see
the Wikipedia entry on PEGs.
For more in-depth descriptions see the original
PEG paper and
packrat parsing paper,
and related papers below.
Bryan Ford, the maintainer of this site,
coined the modern terms PEGs and packrat parsing,
but much of their formal theory existed earlier
and all the more recent work on this topic is by others.
Mailing List
The following mailing list is open to anyone
wishing to announce software or papers related to PEGs and packrat parsing,
or discuss ideas and issues:
The PEG Mailing List
PEG/Packrat Parsing Bibliography
The following research papers
develop and explore PEGs and packrat parsing in detail;
most were written by various authors working independently.
The papers are listed chronologically starting from most recent.
Some of the papers have associated code available online.
-
From Regular Expressions to Parsing Expression Grammars.
Sérgio Medeiros,
Fabio Mascarenhas, and
Roberto Ierusalimschy.
SBLP,
September 2011.
-
Parsing Expression Grammars for Structured Data.
Fabio Mascarenhas,
Sérgio Medeiros,
and
Roberto Ierusalimschy.
SBLP,
September 2011.
-
LL(*): The Foundation of the ANTLR Parser Generator.
Terence Parr and
Kathleen Fisher,
PLDI,
June 2011.
-
TRX: A Formally Verified Parser Interpreter
(PDF).
Logical Methods in Computer Science,
June 2011.
-
BITES instead of FIRST for Parsing Expression Grammar (PDF).
Roman R. Redziejowski,
Fundamenta Informaticae 109(3),
2011.
-
Direct Left-Recursive Parsing Expression Grammars
(PDF).
Laurence Tratt,
Technical report EIS-10-01, Middlesex University,
October 2010.
-
Converting regexes to Parsing Expression Grammars
(PDF).
Marcelo Oikawa,
Roberto Ierusalimschy, and Ana Lúcia de Moura.
SBLP,
September 2010.
-
Packrat Parsers Can Handle Practical Grammars in Mostly Constant Space (PDF).
Kota Mizushima, Atusi Maeda, and Yoshinori Yamaguchi.
PASTE,
June 2010.
-
Mouse: From Parsing Expressions to a Practical Parser (PDF).
Roman R. Redziejowski,
CS&P 2009,
September 2009.
-
Applying classical concepts to Parsing Expression Grammar (PDF).
Roman R. Redziejowski,
Fundamenta Informaticae 93(1-3),
2009.
-
A Text Pattern-Matching Tool based on
Parsing Expression Grammars (PDF).
Roberto Ierusalimschy,
Software: Practice and Experience 39(3),
March 2009.
-
Packrat Parsing in Scala (PDF).
Project Report, Manohar Jonnalagedda, EPFL,
January 2009.
-
Some Aspects of Parsing Expression Grammar (PDF).
Roman R. Redziejowski,
Fundamenta Informaticae 85(1-4),
2008.
-
A Parsing Machine for PEGs (PDF).
Sérgio Medeiros and Roberto Ierusalimschy.
DLS,
July 2008.
-
Packrat parsers can support left recursion (PDF).
Alessandro Warth,
James R. Douglass, and
Todd Millstein,
PEPM '08,
January 2008.
-
DCGs + Memoing = Packrat Parsing: But is it worth it?
Ralph Becket and
Zoltan Somogyi,
PADL '08,
January 2008.
-
OMeta: an Object-Oriented Language for Pattern Matching (PDF).
Alessandro Warth and
Ian Piumarta,
DLS 2007,
October 2007.
-
A Programming Language Where the Syntax and Semantics
Are Mutable at Runtime
(PDF).
Chris Seaton,
Master's thesis, University of Bristol, May 2007.
-
Parsing Expression Grammar as a primitive recursive-descent parser
with backtracking (PDF)
Roman Redziejowski,
CS&P'2006,
September 2006.
-
Modular Syntax Demands Verification (PDF).
Sylvain Schmitz,
Tech Report I3S/RR-2006-32-FR,
Université de Nice,
October 2006.
-
Better Extensibility through Modular Syntax (PDF).
Robert Grimm,
PLDI,
June 2006.
-
Parsing Expression Grammars:
A Recognition-Based Syntactic Foundation.
(PDF).
Bryan Ford,
POPL,
January 2004.
-
Packrat Parsing:
Simple, Powerful, Lazy, Linear Time.
Bryan Ford,
(PDF).
ICFP,
October 2002.
-
Packrat Parsing:
a Practical Linear-Time Algorithm with Backtracking
(PDF).
Bryan Ford,
Master's thesis, MIT, September 2002.
-
Parsing algorithms with backtrack.
Alexander Birman and Jeffrey D. Ullman,
Information and Control, 23(1):1-34, August 1973.
-
The TMG Recognition Schema (PDF).
Alexander Birman,
Ph.D. dissertation, Princeton, February 1970.
PEG-related Projects
The following projects have implemented PEG parsers, parser generators,
and/or combinator libraries in a variety of languages:
- Multi-language:
- ANTLR,
a well-established parser generator
by Terence Parr,
supports extensive PEG features
and combines packrat parsing with LL parsing techniques.
- Waxeye by
Orlando Hill,
a PEG-based parser generator supporting
C, Java, JavaScript, Python, Ruby, and Scheme.
- parboiled
is a parser library for Java and Scala
that interprets parsers expressed "in-line" in Java/Scala code.
- vembyr
supports C++, Python, and Ruby.
- Java:
- Python:
-
The pyparsing
monadic parsing combinator library
by Paul McGuire
now supports packrat parsing.
-
Packrat parsing support
has also been incorporated into
PyPy.
-
pyPEG
is a PEG parser interpreter for Python by Volker Birk.
- Haskell:
-
Frisby
by John Meacham
is a monadic packrat parser library for Haskell
that uses advanced Haskell type system features
to support dynamic specification of composable parsers.
-
Pappy
by Bryan Ford
is a simple prototype packrat parser generator for Haskell.
Not actively supported.
- C, C++:
-
The PEG Template Library
for C++0x
by Colin Hirsch expresses grammars via C++ templates.
-
The peg/leg
parser generator emphasizes convenience for those
already familiar with lex/yacc.
-
chilon is a template-based
parser library that builds ASTs from the same PEG spec.
- C#:
-
NPEG is a library
providing objects to build PEGs incrementally in C#.
-
IronMeta
is a C# implementation of
OMeta.
- JavaScript:
-
OMeta
supports PEG-based pattern matching extended to handle
arbitrary kinds of data.
-
PEG.js
is a parser generator that produces fast parsers
with excellent error reporting.
- Tcl:
The new
grammar::peg module
supports grammar definition and parsing using PEGs.
- Smalltalk:
OMeta
provides PEG-based pattern matching
for the Squeak
programming environment.
- Scheme:
- Common Lisp:
-
CL-peg
by John Leuner supports PEGs and packrat parsing in Common Lisp.
- cl-peg,
for Common Lisp.
- Lua:
Roberto Ierusalimschy has provided
the LPeg pattern-matching library.
- Ruby
now has the Treetop
grammar description and packrat parsing language.
- Dylan:
peg-parser
library by Dustin Voss.
Sample PEGs
This section contains pointers to some "nontrivial" grammars
expressed as PEGs or PEG-like languages:
- xtc
contains modular Rats! grammars for C and Java.
- Mouse
includes grammars for Java and C.
- parboiled
includes a grammar for Java.
- PEG.js
includes a grammar for JavaScript/ECMAScript.
Maintainer: Bryan Ford.
Additions or corrections to this page are welcome —
with apologies if I sometimes take a long time to respond!