
180 Park Ave - Building 103
Florham Park, NJ
System and method for storage media group parity protection,
May 21, 2002
A system and method for storage medium group parity protection stores data files and related parity information asynchronously on an array of storage media. Data files can be stored asynchronously, or synchronously in stripes as in RAIT technology, but related parity information is stored asynchronously with respect to the data files. Regions of the storage media are preferably organized into protection groups for which parity information is generated. Parity information is generated on line as data files are stored and maintained in active memory. Once a protection group is filled, the parity information is migrated to more permanent backup storage. As one example, regions of an array of N storage media can constitute a protection group, and once the regions in the protection group are filled with data, parity data for the protection group is migrated from active memory to more permanent backup storage.
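The protection-group lifecycle described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not name a parity function, so bytewise XOR is assumed, and the class name `ProtectionGroup`, its methods, and the list used as "backup storage" are all hypothetical.

```python
class ProtectionGroup:
    """Hypothetical protection group: one region on each of n_media storage media."""

    def __init__(self, n_media, region_size):
        self.n_media = n_media
        self.region_size = region_size
        self.regions = {}                   # media index -> stored region bytes
        self.parity = bytes(region_size)    # running parity kept in active memory

    def store(self, media_index, data):
        """Fill one region and fold it into the in-memory parity (XOR is an assumption)."""
        if media_index in self.regions:
            raise ValueError("region already filled")
        if len(data) > self.region_size:
            raise ValueError("data larger than region")
        block = data.ljust(self.region_size, b"\0")   # pad short data to the region size
        self.regions[media_index] = block
        self.parity = bytes(p ^ b for p, b in zip(self.parity, block))

    def is_full(self):
        return len(self.regions) == self.n_media

    def migrate_parity(self, backup):
        """Once every region is filled, move the parity out of active memory."""
        if not self.is_full():
            raise RuntimeError("protection group is not full yet")
        backup.append(self.parity)
        parity, self.parity = self.parity, None       # no longer held in memory
        return parity

    def recover(self, lost_index, parity):
        """Rebuild one lost region from the surviving regions and the migrated parity."""
        result = parity
        for index, block in self.regions.items():
            if index != lost_index:
                result = bytes(r ^ b for r, b in zip(result, block))
        return result
```

Regions can be filled in any order and at any time; the parity is written out only once, when the group is complete, which is the sense in which parity storage is asynchronous with respect to the data in this sketch.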
Method and apparatus for packet analysis in a network,
November 11, 2008
A method and system for monitoring traffic in a data communication network and for extracting useful statistics and information are disclosed. In accordance with an embodiment of the invention, a network interface card has a run-time system and one or more processing modules executing on the network interface. The run-time system feeds information derived from a network packet to the processing modules, which process the information and generate output such as condensed statistics about the packets traveling through the network.
Method and apparatus for loading data into a cube forest data structure,
December 25, 2001
A device and method are disclosed for loading data into and updating a data structure known as a cube forest for use in a batch-load-then-read-intensively system. The device and method perform loading and updating functions efficiently. Hierarchically split cube forests provide a method for efficiently duplicating information, and can be optimized to reduce update and storage costs. Cube forests, including hierarchically split cube forests, are most appropriate for read-intensive, update-rarely-and-in-large-batches multidimensional applications in an off-the-shelf (low cost) hardware environment. A method and an apparatus for loading data into and updating a cube forest are disclosed herein.
Method and apparatus for packet analysis in a network,
January 16, 2007
A method and system for monitoring traffic in a data communication network and for extracting useful statistics and information are disclosed. In accordance with an embodiment of the invention, a network interface card has a run-time system and one or more processing modules executing on the network interface. The run-time system feeds information derived from a network packet to the processing modules, which process the information and generate output such as condensed statistics about the packets traveling through the network.
Method and apparatus for analyzing co-evolving time sequences,
April 25, 2000
An analyzer system analyzes a plurality of co-evolving time sequences to, for example, perform correlation or outlier detection on the time sequences. The plurality of co-evolving time sequences comprises a delayed time sequence and one or more known time sequences. A goal is to predict the delayed value given the available information. Each time sequence has a present value and (N-1) past values, where N is the number of samples (time-ticks) of each time sequence. The analyzer system receives the plurality of co-evolving time sequences and determines a window size (w). The analyzer then assigns the delayed time sequence as a dependent variable and assigns the present values of a subset of the known time sequences, together with the past values of that subset and of the delayed time sequence, as a plurality of independent variables. Past values delayed by up to w steps are considered. The analyzer then forms an equation comprising the dependent variable and the independent variables, and solves the equation using a least squares method. The delayed time sequence is then determined using the solved equation.
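The regression set-up described above can be sketched in a few lines. This is an illustrative sketch only: the function names (`feature_row`, `fit`, `predict`) are hypothetical, all known sequences are used rather than a selected subset, and the least squares problem is solved through the normal equations with Gauss-Jordan elimination, which is one common choice and not necessarily the solver the patent uses.

```python
def feature_row(delayed, known, t, w):
    """Independent variables for time-tick t: present values of the known
    sequences, then past values (lags 1..w) of the delayed and known sequences."""
    row = [seq[t] for seq in known]
    for lag in range(1, w + 1):
        row.append(delayed[t - lag])
        row.extend(seq[t - lag] for seq in known)
    return row


def fit(delayed, known, w):
    """Least squares fit of delayed[t] on feature_row(t) for t = w .. N-1."""
    X = [feature_row(delayed, known, t, w) for t in range(w, len(delayed))]
    y = [delayed[t] for t in range(w, len(delayed))]
    n = len(X[0])
    # Augmented normal equations [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("singular system: not enough independent samples")
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coeffs, delayed, known, t, w):
    """Estimate the delayed sequence's value at time-tick t from the fitted equation."""
    return sum(c * x for c, x in zip(coeffs, feature_row(delayed, known, t, w)))
```

With one known sequence and w = 1, for example, the independent variables for tick t are the known sequence's values at t and t-1 and the delayed sequence's value at t-1. A large gap between `predict` and a value that later arrives is one simple way such a model could flag an outlier.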
Method and apparatus for querying a cube forest data structure,
July 23, 2002
A device and method are disclosed for using a data structure known as a cube forest in a batch-load-then-read-intensively system. The device and method significantly improve the time to execute a bit vector query. Hierarchically split cube forests provide a method for efficiently duplicating information, and can be optimized to reduce update and storage costs. Cube forests, including hierarchically split cube forests, are most appropriate for read-intensive, update-rarely-and-in-large-batches multidimensional applications in an off-the-shelf (low cost) hardware environment. A method and an apparatus for querying a cube forest for aggregates are disclosed herein.
Query-aware sampling of data streams,
May 19, 2009
A system, method and computer-readable medium provide for assigning sampling methods to each input stream for arbitrary query sets in a data stream management system. The method embodiment comprises splitting all query nodes in a query directed acyclic graph (DAG) having multiple parent nodes into sets of independent nodes having a single parent, computing a grouping set for every node in each set of independent nodes, reconciling each parent node with each child node in each set of independent nodes, reconciling between multiple child nodes that share a parent node, and generating a final grouping set for at least one node describing how to sample an input stream for that node.
System and method for storage media group parity protection,
September 11, 2001
A system and method for storage medium group parity protection stores data files and related parity information asynchronously on an array of storage media. Data files can be stored asynchronously, or synchronously in stripes as in RAIT technology, but related parity information is stored asynchronously with respect to the data files. Regions of the storage media are preferably organized into protection groups for which parity information is generated. Parity information is generated on line as data files are stored and maintained in active memory. Once a protection group is filled, the parity information is migrated to more permanent backup storage. As one example, regions of an array of N storage media can constitute a protection group, and once the regions in the protection group are filled with data, parity data for the protection group is migrated from active memory to more permanent backup storage.
Coarse Indexes for a Data Warehouse,
April 10, 2001
A coarse database index, and system and method of use therefor, that will quickly indicate which data partitions of a table contain a given key. Once the target data partitions are located, the exact record locations can be found using traditional indexes. The coarse indexes take little space and can be updated and searched quickly. The coarse index is used in conjunction with a database including a plurality of data partitions. Each data partition includes data, including a plurality of key values of at least one key, and at least one dense index referencing the data. The coarse index indexes the plurality of key values according to the data partitions containing each key value. The coarse index includes a first bitmap, which is preferably arranged in key value major format. The coarse index may also include a second bitmap, which is preferably arranged in data partition major format. The second bitmap may be transformed from data partition major format to key value major format. The first and second bitmap partitions may be compressed.
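To make the two bitmap layouts concrete, here is a small sketch. It assumes Python integers as bit vectors and uses hypothetical function names; compression and the dense per-partition indexes are left out.

```python
def build_key_major(partitions):
    """Key value major layout: map each key value to a bit vector over
    partitions, where bit p is set if partition p contains the key."""
    index = {}
    for pid, keys in enumerate(partitions):
        for key in keys:
            index[key] = index.get(key, 0) | (1 << pid)
    return index


def partitions_for(index, key):
    """Return the partition numbers that may contain `key` (empty if none)."""
    mask = index.get(key, 0)
    return [pid for pid in range(mask.bit_length()) if (mask >> pid) & 1]


def to_key_major(partition_major, keys):
    """Transform a data partition major bitmap into key value major format.
    partition_major[p] is a bit vector over positions in the list `keys`."""
    index = {}
    for pid, mask in enumerate(partition_major):
        for pos, key in enumerate(keys):
            if (mask >> pos) & 1:
                index[key] = index.get(key, 0) | (1 << pid)
    return index
```

A lookup such as `partitions_for(index, key)` only narrows the search to candidate partitions; locating the exact records would then fall to each partition's own dense index, as the abstract describes.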
Fault-Tolerant Storage System,
April 17, 2001
The present invention is a storage system, and method of operation thereof, which provides improved performance over standard RAID-5 without increasing vulnerability to single-disk drive failures. The storage system comprises a processor and a plurality of data storage devices, coupled to the processor, operable to store a plurality of data stripes, each data stripe comprising a plurality of data blocks and a parity block, each data storage device operable to store one data block or the parity block of each data stripe. The storage system stores a dirty stripe bit vector of a data stripe. When an update to a data block in the data stripe is received, an image of the data block as it was when the dirty stripe bit vector was generated is stored. The data block is updated and an image of the updated data block is stored. When a failure of one of the plurality of data storage devices is detected, a bitwise exclusive-OR of the image of the data block as it was when the dirty stripe bit vector was generated and the image of the updated data block is generated to form an intermediate result. The parity block of the data stripe is read and a bitwise exclusive-OR of the intermediate result and the parity block is generated. The generated parity block is written and a parity rebuild is performed on the data stripe using the new parity block.
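The exclusive-OR steps in this abstract can be checked with a short sketch. The function names are hypothetical, blocks are modeled as equal-length `bytes` objects, and the dirty stripe bit vector bookkeeping is omitted; only the parity arithmetic is shown.

```python
def xor_blocks(a, b):
    """Bitwise exclusive-OR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def refresh_parity(stale_parity, old_image, new_image):
    """Bring a stale parity block up to date after one data block changed:
    intermediate = old XOR new, then new parity = intermediate XOR stale parity."""
    intermediate = xor_blocks(old_image, new_image)
    return xor_blocks(intermediate, stale_parity)


def rebuild_block(surviving_blocks, parity):
    """Parity rebuild: recover the block on the failed device from the
    surviving data blocks of the stripe and the current parity block."""
    result = parity
    for block in surviving_blocks:
        result = xor_blocks(result, block)
    return result
```

The design point this illustrates is that the parity block does not have to be rewritten on every update: keeping the old and new images of the changed block is enough to correct the parity later, at the moment a device failure makes it necessary.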
Method and system for squashing a large data set,
March 25, 2003
Apparatus and method for summarizing an original large data set with a representative data set. The data elements in both the original data set and the representative data set have the same variables, but there are significantly fewer data elements in the representative data set. Each data element in the representative data set has an associated weight, representing the degree of compression. There are three steps for constructing the representative data set. First, the original data elements are partitioned into separate bins. Second, moments of the data elements partitioned in each bin are calculated. Finally, the representative data set is generated by finding data elements and associated weights having substantially the same moments as the original data set.
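The three steps can be illustrated on one-dimensional data. This is a deliberately simplified sketch under stated assumptions: equal-width bins, moments up to order two only, and a closed-form choice of two weighted points per bin (mean plus and minus the standard deviation), whereas the abstract leaves the binning rule, the moment orders, and the search for representative elements open. The names `squash` and `weighted_moment` are hypothetical.

```python
import math


def squash(values, n_bins):
    """Return a list of (value, weight) pairs summarizing `values`."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0          # avoid zero width when all values are equal
    # Step 1: partition the original data elements into bins.
    bins = [[] for _ in range(n_bins)]
    for v in values:
        bins[min(int((v - lo) / width), n_bins - 1)].append(v)
    representative = []
    for members in bins:
        if not members:
            continue
        # Step 2: compute the moments of the elements in this bin.
        n = len(members)
        mean = sum(members) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in members) / n)
        # Step 3: emit weighted pseudo-points with the same count, sum, and sum of squares.
        if sd == 0:
            representative.append((mean, float(n)))
        else:
            representative.append((mean - sd, n / 2))
            representative.append((mean + sd, n / 2))
    return representative


def weighted_moment(points, order):
    """Weighted raw moment: sum of weight * value**order."""
    return sum(w * x ** order for x, w in points)
```

Each bin contributes at most two weighted points regardless of how many original elements it held, so the representative set shrinks as the bins fill while the weighted count, sum, and sum of squares stay equal to those of the original data.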