Hancock
Hancock is a C-based domain-specific language designed to make it easy
to read, write, and maintain programs that manipulate large amounts of
relatively uniform data. Because Hancock is embedded in C, it
inherits all the functionality of C. Valid C programs are also valid
Hancock programs, and Hancock programs can use libraries written for
C. But Hancock is more than C. In addition to C constructs, Hancock
provides domain-specific forms to facilitate large-scale data
processing.
For a given data-processing task, Hancock may be suitable if:
- The task requires a small number of linear passes over a
relatively uniform data source.
- The task requires storing persistent information.
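As a rough illustration of the kind of task these criteria describe, the sketch below makes one linear pass over fixed-width records and folds each record into per-key state that is saved between runs. It is written in plain C and does not use Hancock's domain-specific forms. The names `record_t`, `profile_t`, `NUM_KEYS`, `process_stream`, `save_profile`, and `load_profile` are hypothetical and chosen only for this example.

```c
#include <stdio.h>

/* Hypothetical fixed-width record: every entry in the data source
   has the same layout. */
typedef struct {
    long long key;     /* e.g. an account identifier */
    double    amount;  /* e.g. a usage measurement   */
} record_t;

#define NUM_KEYS 4     /* hypothetical legal key range: 0 .. NUM_KEYS-1 */

/* Per-key state that is carried from one run to the next. */
typedef struct {
    double total[NUM_KEYS];
} profile_t;

/* One linear pass: read records in order and fold each one into the
   profile. Returns the number of records applied; records whose key
   falls outside the legal range are skipped. */
long process_stream(FILE *in, profile_t *p) {
    record_t r;
    long applied = 0;
    while (fread(&r, sizeof r, 1, in) == 1) {
        if (r.key < 0 || r.key >= NUM_KEYS)
            continue;
        p->total[r.key] += r.amount;
        applied++;
    }
    return applied;
}

/* Persist the profile so that a later run can resume from it.
   Returns 0 on success and -1 on failure. */
int save_profile(FILE *out, const profile_t *p) {
    return fwrite(p, sizeof *p, 1, out) == 1 ? 0 : -1;
}

/* Load a previously saved profile. If none can be read, the profile
   is reset to zeros and -1 is returned. */
int load_profile(FILE *in, profile_t *p) {
    profile_t zero = {{0}};
    if (fread(p, sizeof *p, 1, in) == 1)
        return 0;
    *p = zero;
    return -1;
}
```

A typical run under these assumptions would call `load_profile`, then `process_stream` on the day's data, then `save_profile`. The raw binary layout used here is a simplification and is not portable across machines.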
New in Version 2.0
Hancock 2.0 adds the notion of parameterization for Hancock's
directory, map, pickle, and stream types. This mechanism reduces
the number of distinct types that a programmer must specify without
sacrificing the data safety gained from Hancock's static and dynamic
type checking. Parameterization can also help with processing
variable-width data.
Hancock 2.0 maps are always keyed by values of type long long.
Such maps use a new mechanism for conveying the legal range
of keys and for tuning their on-disk representations.
Hancock 2.0 introduces a new type, the generative stream, for
streams that do not have a physical representation.
Finally, Hancock 2.0 comes with a small library of stream
types, which facilitate iterating over various combinations of maps.
Version 2.0.1
Version 2.0.1 fixes a performance bug introduced in Hancock 2.0 and
a few other minor bugs.
Version 2.0.2
Hancock 2.0.2 updates the Hancock 2.0.1 release to work with current operating systems and C libraries.
Obtaining Hancock
Hancock distributions and installation instructions are available.
Documentation and papers
- Hancock Manual (ps) (pdf)
K. Fisher, K. Hogstedt, A. Rogers, and F. Smith.
- Journal paper (ps) (pdf)
K. Fisher, K. Hogstedt, A. Rogers, and F. Smith.
- Hancock: A Language for Extracting Signatures from Data Streams (Abstract) (ps) (pdf)
C. Cortes, K. Fisher, D. Pregibon, A. Rogers, and F. Smith.
In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining, 2000, pages 9-17.
Conference slides (ps) (pdf).
- Hancock: A Language for Processing Very Large-Scale Data (Abstract) (ps) (pdf)
D. Bonachea, K. Fisher, A. Rogers, and F. Smith.
In USENIX 2nd Conference on Domain-Specific Languages, 1999, pages 163-176.
Inquiries
Please send inquiries to: hancock@research.att.com.