You are welcome to download and use the software tools appearing on this page that have been developed by AT&T Labs researchers. Please refer to the individual project web pages for specific license agreements. If an available license agreement does not meet your needs, please contact firstname.lastname@example.org for assistance with a customized license.
AST: Advanced Software Technologies Open Source Collection
AST (AT&T Software Technology) OpenSource software is a collection of libraries and commands for UNIX and Windows. Included are re-implementations of many POSIX and X/Open APIs and utilities. It provides a portable and efficient environment that behaves consistently across a range of operating system and hardware implementations: portable because one collection of source builds unattended on all target architectures, and efficient because the underlying algorithms are continually updated to match best-in-class performance.
Some of the more popular components include: cdt, dss, ksh93, nmake, pax, sfio, vcodex, and vmalloc. AST has been used internally in AT&T since the mid-1980s, and was released as OpenSource in 1999. Documentation, source and binaries are currently available under the OpenSource Eclipse Public License 1.0 at http://github.com/att/ast.
CDT: Container Data Types Library
A Container Data Type library. This provides all common containers such as list, stack, queue, ordered set/bag, unordered set/bag, etc. based on a uniform API. All containers can be used in a concurrent environment with either multiple threads or multiple processes (using shared memory). There is also a hash table data structure that provides lock-less reads and less-lock updates using atomic scalar operations.
ECharts: A state machine-based programming language
ECharts is a state machine-based programming language for event-driven systems derived from the standardized UML Statecharts language. ECharts distinguishes itself from other Statecharts dialects by focusing on implementation issues such as determinism and code re-use. Like Statecharts, ECharts supports hierarchical state machines, concurrent machines and a graphical syntax. Unlike Statecharts, ECharts supports a simple textual syntax, machine reuse, multiple transition priority levels to minimize non-determinism, machine arrays, and a new approach to inter- and intra-machine communication. ECharts is a hosted language, which means that it depends on an underlying programming language such as Java. ECharts has a proven track record in a large-scale commercial deployment.
FastRWeb: FastRWeb is an infrastructure for web-based reporting, data analysis and visualization using R.
It leverages the capabilities of R, such as graphics, statistical models, data access and manipulation, and makes them easily available on the Web. It can also be used to quickly create REST services based on data and/or analyses.
GGobi: Data visualization for high-dimensional data
GGobi is an open source visualization program for exploring high-dimensional data. It provides highly dynamic and interactive graphics such as tours, as well as familiar graphics such as the scatterplot, barchart and parallel coordinates plots. Plots are interactive and linked with brushing and identification.
GSDjVu/DjVuDigital: Ghostscript driver to convert PS and PDF files to DjVu files
gsdjvu contains the source code for a Ghostscript driver that converts PostScript(tm) and Portable Document Format (PDF) electronic document files into DjVu files.
Graphviz: Tools for viewing and interacting with graph diagrams
Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks. Automatic graph drawing has many important applications in software engineering, database and web design, networking, and in visual interfaces for many other domains. Graphviz is open source graph visualization software. It has several main graph layout programs.
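As a small illustration of how structural information becomes input for the layout programs, the sketch below writes a directed graph in the DOT language that Graphviz reads. The helper name `to_dot` and the example edge list are hypothetical and exist only for this illustration; they are not part of Graphviz.

```python
def quote(name):
    """Wrap a node name in double quotes, escaping any quotes inside it."""
    return '"' + name.replace('"', '\\"') + '"'


def to_dot(edges, graph_name="G"):
    """Render (source, target) pairs as DOT text for a directed graph."""
    lines = [f"digraph {graph_name} {{"]
    for src, dst in edges:
        # One edge statement per pair; quoting keeps names with spaces valid.
        lines.append(f"    {quote(src)} -> {quote(dst)};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical example data: stages of a small build pipeline.
edges = [
    ("parse", "typecheck"),
    ("typecheck", "codegen"),
    ("parse", "report errors"),
]
dot_text = to_dot(edges)
print(dot_text)
```

If the printed text is saved to a file such as `pipeline.dot`, a layout program like `dot` can render it, for example with `dot -Tsvg pipeline.dot -o pipeline.svg`.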
Nanocubes: Fast Visualization of Large Spatiotemporal Datasets
Nanocubes are a fast data structure for in-memory data cubes developed at the Information Visualization department at AT&T Labs – Research. Nanocubes can be used to explore datasets with billions of elements at interactive rates in a web browser, and in some cases they use sufficiently little memory that you can run a nanocube on a modern-day laptop.
PADS: Processing Arbitrary Data Streams
PADS is a system that simplifies processing ad hoc data sources. Its users can declaratively describe data sources and then use generated tools to understand, parse, translate, and format data.
Rserve: Rserve is a client/server infrastructure allowing the use of R from a large number of languages and environments.
Clients are available for C++, Java, R, Python, Ruby and other languages. Rserve also supports a variety of protocols such as HTTP, WebSockets, QAP, and TLS.
Rserve is used as a backbone for many projects including FastRWeb and RCloud. It can also be used for high-performance distributed computing (see Rserve.cluster) as well as computation on distributed storage (see RCassandra).
Sfio: Portable library for performing I/O
Spatial Index Library: Generic main memory and disk based storage managers
UWIN: Unix on Windows 95 and NT Machines
UWIN allows UNIX applications to be built and run on Microsoft Windows 7, Vista, XP, 2000, NT, ME, 98, 95 (W7/VI/XP/2K/NT/ME/98/95) with few, if any, changes necessary. UWIN source and binaries are available at http://github.com/att/uwin.
Vcodex: Software package for data transformation
Data transformation platform. This can be used to compress, encrypt, checksum and transcode data. The platform is structured in three layers:
Compressing kennedy.xls from the Canterbury corpus:
WSP: Web Scraping Proxy
Programmers often need to use information on Web pages as input to other programs. This is done by Web Scraping: writing a program that simulates a person viewing a Web site with a browser. These programs are often hard to write because it is difficult to determine the Web requests necessary to do the simulation. The Web Scraping Proxy (WSP) solves this problem by monitoring the flow of information between the browser and the Web site and emitting Perl code fragments that can be used to write the Web Scraping program. A developer would use the WSP by browsing the site once with a browser that accesses the WSP as a proxy server. The developer then uses the emitted code as a template to build a Perl program that accesses the site.
Yoix: The Yoix Scripting Language and Interpreter
The Yoix scripting language is a general-purpose programming language that uses syntax and functions familiar to users of C and Java. It is not an object-oriented language, but makes use of over 150 object types that provide access to most of the standard Java classes.
dss: dss (data stream scan) is a framework for describing, transforming, reading, querying, and writing streams of record oriented data.
It is implemented as a command and library API. The API is extended by plugins (DLLs / shared libraries) that define data domain specific I/O, type and query functions. dss compares favorably against Perl, the typical recourse in the networking community, and against customized C/C++ code written to deal with single domain datasets. Supported data includes Netflow, BGP, HTTP proxy and server logs, OSPF LSA, HTML, JSON, and generic flat file formats via XML schemas. Type support includes IPv4 and IPv6 address formatting and longest prefix matching (using the iv library), AS path regular expression matching, second and subsecond time querying and formatting, and numeric types including BCD and IBM floating point.
fastshp: Tools for manipulation of shapefiles, optimized for speed in order to handle very large shapefiles (such as complete TIGER/Line databases).
It supports operations such as geomapping (e.g. location to/from ZIP(ZCTA), County, State, nearest town etc.), polygon thinning (data reduction) or centroid computation.
iPlots eXtreme: High-performance interactive graphics for the analysis of big data.
iPlots eXtreme provides highly interactive graphics (brushing, linking, direct manipulation, ...) for data analysis. It leverages GPU acceleration to support visualization of large data. It can be used in conjunction with R or as stand-alone software.
iPlots: Interactive graphics for data analysis in R
iPlots is a package for the R statistical environment which provides high interaction statistical graphics, written in Java. It offers a wide variety of plots, including histograms, barcharts, scatterplots, boxplots, fluctuation diagrams, parallel coordinates plots and spineplots. All plots support interactive features, such as querying, linked highlighting, color brushing, and interactive changing of parameters.
iv: iv is a fast software implementation of the IPv4 and IPv6 LPM (longest prefix match) algorithms.
The iv algorithm uses interval matching, which is about 2X slower than the lpm retrie algorithm for IPv4 addresses. However, for applications that must do both IPv4 and IPv6 LPM, iv may be the better choice because the same iv API may be used for matching addresses of any length (including but not limited to IPv4 and IPv6 addresses).
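To make the idea of interval matching concrete, here is a minimal sketch. It is not the iv API or its implementation. It relies only on the general observation that every prefix covers one contiguous range of addresses, so a prefix table can be flattened into sorted, non-overlapping intervals and searched with binary search. The function names and the sample tables are assumptions made for this illustration, and each table is assumed to hold a single address family.

```python
import bisect
import ipaddress


def build_intervals(prefixes):
    """Flatten {prefix: value} into sorted interval starts and their values."""
    nets = [(ipaddress.ip_network(p), v) for p, v in prefixes.items()]
    ranges = []
    points = set()
    for net, value in nets:
        start = int(net.network_address)
        end = start + net.num_addresses  # exclusive upper bound
        ranges.append((start, end, net.prefixlen, value))
        points.update((start, end))
    starts = sorted(points)
    values = []
    for point in starts:
        # The covering set is constant between boundaries, so testing the
        # first address of each interval is enough. Pick the longest prefix.
        best_len, best_value = -1, None
        for start, end, length, value in ranges:
            if start <= point < end and length > best_len:
                best_len, best_value = length, value
        values.append(best_value)
    return starts, values


def lookup(starts, values, address):
    """Return the value of the longest matching prefix, or None."""
    key = int(ipaddress.ip_address(address))
    index = bisect.bisect_right(starts, key) - 1
    return values[index] if index >= 0 else None


# Hypothetical tables; the same two functions serve both address lengths.
v4_table = build_intervals({"10.0.0.0/8": "A", "10.1.0.0/16": "B"})
v6_table = build_intervals({"2001:db8::/32": "X", "2001:db8:1::/48": "Y"})
print(lookup(*v4_table, "10.1.2.3"))
print(lookup(*v6_table, "2001:db8:2::1"))
```

The table construction here is quadratic in the number of prefixes to keep the sketch short; only the lookup path, a single binary search over integer boundaries, reflects why the same code works for addresses of any length.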
jSPaRKy: jSPaRKy v2.0 is a freely-available sentence planner implemented in Java.
To download, click here. Uncompress. Read the included readme. Email us with any questions or suggestions. Check back for updates!
This version of SPaRKy is different from the first one in several ways.
lpm: lpm is a fast software implementation of the IPv4 LPM (longest prefix match) algorithm.
It is available as both a library and a standalone command. BGP (Border Gateway Protocol) routers use LPM lookup on a table of IP address prefixes to determine the next hop address for each incoming packet. Routers implement LPM in silicon, but software implementations are still useful for offline analysis. Most published software approaches attempt to minimize memory size and accesses, but often at the expense of complexity. The lpm algorithm uses the AT&T retrie (radix encoded trie, recursive trie) data structure. A retrie has a simple layout and a simple search inner loop. Our timings and memory requirements match or beat the best published algorithms; we also feel that retries have the edge on simplicity. For IPv6 LPM see the iv (interval) library.
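For readers unfamiliar with longest prefix match, the sketch below shows the lookup on a plain one-bit-per-level binary trie. This is deliberately not the retrie described above, whose layout is not given here, and it is not the lpm library's API. The class name, method names, and the sample routing table are hypothetical.

```python
import ipaddress


class BinaryTrie:
    """One-bit-per-level trie mapping IPv4 prefixes to next hops."""

    def __init__(self):
        # Each node is a dict with optional child keys 0 and 1,
        # plus a "value" key when a prefix ends at that node.
        self.root = {}

    def insert(self, prefix, next_hop):
        net = ipaddress.IPv4Network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1  # walk from the most significant bit
            node = node.setdefault(bit, {})
        node["value"] = next_hop

    def lookup(self, address):
        bits = int(ipaddress.IPv4Address(address))
        node = self.root
        best = node.get("value")  # a /0 default route, if one was inserted
        for i in range(32):
            node = node.get((bits >> (31 - i)) & 1)
            if node is None:
                break
            if "value" in node:
                best = node["value"]  # remember the deepest match so far
        return best


# Hypothetical routing table.
table = BinaryTrie()
table.insert("0.0.0.0/0", "default")
table.insert("192.168.0.0/16", "hop-b")
table.insert("192.168.1.0/24", "hop-c")
print(table.lookup("192.168.1.7"))
```

A binary trie visits up to 32 nodes per IPv4 lookup; multi-bit and radix-encoded designs exist to shorten that path, which is the kind of trade-off between memory accesses and complexity mentioned above.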
snippets: A collection of R tools frequently used with AT&T data, such as plotting data on maps, word clouds, recursive plot layout, smart labels etc.
Twitterscope: Data Management / Text Mining
Text mining software, with the following functionalities
vmalloc: Region Memory Allocator
A memory allocation platform. This provides a uniform interface for memory allocation that extends the well-used malloc/free/realloc interface. Memory allocation is done via general "regions" that can be built from heap memory, shared memory or from other regions. Concurrent accesses from both parallel threads and parallel processes are handled transparently. The data structure and algorithm for doing this are both faster and more space efficient than other known memory allocators.
BoosTexter: A general purpose machine-learning program
BoosTexter is a general purpose machine-learning program based on boosting for building a classifier from text and/or attribute-value data.
ASDT: The AT&T Statistical Dialog Toolkit (ASDT)
The AT&T Statistical Dialog Toolkit (ASDT) enables developers to build spoken dialog systems that track a distribution over multiple dialog states. The engine provided by the toolkit updates this distribution efficiently, in real time, during the dialog. The engine is implemented in Python.
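As a rough illustration of what tracking a distribution over dialog states can look like, the sketch below applies a generic Bayesian update: each state's probability is multiplied by the likelihood of the latest observation under that state, and the result is renormalized. This is not the ASDT API, and the toolkit's actual update rule may differ; the function name, state names, and likelihood values are all assumptions made for this example.

```python
def update_belief(belief, likelihood):
    """Return the posterior over states after one observation.

    belief:     dict mapping state -> prior probability
    likelihood: dict mapping state -> P(observation | state)
    """
    weighted = {s: p * likelihood.get(s, 0.0) for s, p in belief.items()}
    total = sum(weighted.values())
    if total == 0.0:
        # The observation is impossible under every state; keep the prior.
        return dict(belief)
    return {s: w / total for s, w in weighted.items()}


# Hypothetical example: which city did the caller ask for?
prior = {"boston": 1 / 3, "austin": 1 / 3, "houston": 1 / 3}
# Assumed recognizer scores for one utterance, read as likelihoods.
heard = {"boston": 0.6, "austin": 0.3, "houston": 0.1}

after_one = update_belief(prior, heard)
after_two = update_belief(after_one, heard)  # same evidence heard again
print(after_two)
```

Keeping the full distribution, rather than committing to the single best hypothesis after each turn, lets repeated weak evidence accumulate across the dialog.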
More and more devices talk back to you nowadays. Where it was once common to hear only phone dialog systems speaking to you (and understanding you), an increasing number of personal devices, including laptops, smartphones, GPS navigation systems, and game devices, talk to you, too.
A package for speech recognition, language modeling, voice biometrics and building of speech-based services.
Related packages/services include: