AT&T Labs - Research

Hancock

Hancock is a C-based domain-specific language designed to make it easy to read, write, and maintain programs that manipulate large amounts of relatively uniform data. Because Hancock is embedded in C, it inherits all the functionality of C. Valid C programs are also valid Hancock programs, and Hancock programs can use libraries written for C. But Hancock is more than C. In addition to C constructs, Hancock provides domain-specific forms to facilitate large-scale data processing.

For a given data-processing task, Hancock may be suitable if:

  • The task requires a small number of linear passes over a relatively uniform data source.
  • The task requires storing persistent information.
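The two criteria above can be illustrated in plain C, which is also valid Hancock. The following is only a sketch of the shape of such a task, not Hancock's own domain-specific forms: the record layout, the key range NUM_KEYS, and the function names are all hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-width record; not a Hancock type. */
typedef struct {
    long long key;    /* e.g. an account identifier */
    long      amount; /* e.g. a per-event quantity  */
} record_t;

#define NUM_KEYS 4 /* illustrative legal key range: 0 .. NUM_KEYS-1 */

/* One linear pass: fold each record into its key's running total.
   Returns the number of records skipped because the key is out of range. */
size_t accumulate(const record_t *recs, size_t n, long totals[NUM_KEYS])
{
    size_t skipped = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].key < 0 || recs[i].key >= NUM_KEYS) {
            skipped++;
            continue;
        }
        totals[recs[i].key] += recs[i].amount;
    }
    return skipped;
}

/* Persist the totals so that a later run can pick up where this one ended.
   Returns 0 on success, -1 on failure. */
int save_totals(const char *path, const long totals[NUM_KEYS])
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(totals, sizeof totals[0], NUM_KEYS, f);
    return (fclose(f) == 0 && written == NUM_KEYS) ? 0 : -1;
}

/* Reload totals written by save_totals. Returns 0 on success, -1 on failure. */
int load_totals(const char *path, long totals[NUM_KEYS])
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t nread = fread(totals, sizeof totals[0], NUM_KEYS, f);
    fclose(f);
    return nread == NUM_KEYS ? 0 : -1;
}
```

A typical run under these assumptions would call load_totals at startup, call accumulate once over the day's records, and call save_totals at the end. Tasks with this shape, a few sequential passes plus state carried between runs, are the ones the criteria above describe.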

New in Version 2.0

Hancock 2.0 adds the notion of parameterization for Hancock's directory, map, pickle, and stream types. This mechanism reduces the number of distinct types that a programmer must specify without sacrificing the data safety gained from Hancock's static and dynamic type checking. Parameterization can also help with processing variable-width data.

Hancock 2.0 maps are always keyed by values of type long long. Such maps use a new mechanism for conveying the legal range of keys and for tuning their on-disk representations. Hancock 2.0 also introduces generative streams, a new type for streams that do not have a physical representation. Finally, Hancock 2.0 comes with a small library of stream types, which facilitate iterating over various combinations of maps.
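As a rough analogy for a stream without a physical representation, the plain-C sketch below produces its elements on demand instead of reading them from a file. It does not use Hancock's generative stream syntax; the type key_range_stream and both functions are hypothetical names introduced only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical generator state: yields the keys lo, lo+1, ..., hi. */
typedef struct {
    long long next; /* next key to yield            */
    long long hi;   /* last key to yield, inclusive */
    bool      done; /* true once the range is spent */
} key_range_stream;

/* Create a stream over the inclusive key range [lo, hi].
   An empty range (lo > hi) yields nothing. */
static key_range_stream key_range_open(long long lo, long long hi)
{
    key_range_stream s = { lo, hi, lo > hi };
    return s;
}

/* Store the next key in *out and return true,
   or return false when the stream is exhausted. */
static bool key_range_next(key_range_stream *s, long long *out)
{
    if (s->done) return false;
    *out = s->next;
    if (s->next == s->hi)
        s->done = true; /* stop here rather than increment past hi */
    else
        s->next++;
    return true;
}
```

A consumer would loop with while (key_range_next(&s, &k)) and process each key k, exactly as it would for records read from disk. The done flag avoids incrementing past hi, so a range ending at the largest long long value does not overflow.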

Version 2.0.1

Version 2.0.1 fixes a performance bug introduced in Hancock 2.0 and a few other minor bugs.

Version 2.0.2

Hancock 2.0.2 updates the Hancock 2.0.1 release to work with current operating systems and C libraries.

Obtaining Hancock

Hancock distributions and installation instructions are available.

Documentation and papers

People

Affiliates

Inquiries

Please send inquiries to: hancock@research.att.com.