
180 Park Ave - Building 103
Florham Park, NJ
Nick Duffield is a Distinguished Member of Technical Staff and an AT&T Fellow at AT&T Labs-Research, Florham Park, NJ, where he has been since 1995. He previously held post-doctoral and faculty positions in Dublin, Ireland, and Heidelberg, Germany. He received a BA in Natural Sciences in 1982, a MMath (Part III Maths) in 1983 from the University of Cambridge, UK, and a PhD in Mathematical Physics from the University of London, U.K., in 1987. His current research focuses on measurement and inference of networks and traffic. He is a co-inventor of the Smart Sampling technologies that lie at the heart of AT&T's scalable Traffic Analysis Service. Dr. Duffield was Charter Chair of the IETF working group on Packet Sampling. He was an Associate Editor for the IEEE/ACM Transactions on Networking from 2007-2011. He is an IEEE Fellow, and was a co-recipient the Sigmetrics 2012 Test of Time Award, for work in Network Tomography.
Publications
Major research interests (most recent first)
- Trajectory Sampling. Coordinate packet sampling across routers based on hash of invariant packet fields. Now standardized in the IETF.
- Network Tomography.
- Traffic Matrices: inferring traffic matrices from link level traffic aggregates. Deployed in network measurement infrastructure.
- Link Performance: inferring link level loss and delay by correlating end-to-end measurements across a mesh of paths. Support for reporting by transport protocols standardized in the IETF.
- The Hose Model for Virtual Private Networks: service without advance need of a traffic matrix
- Network Measurement Infrastructure: early generation (T3 speed) passive monitors
- Analysis and Measurement of Effective Bandwidths: using
- Statistical Mechanics
- Quantum dynamical semigroups and mean-field theory
- Large deviations and group representations
- Non-equilibrium phase transitions
Test of Time, 2012 ACM SIGMETRICS, for
Network Tomography on General Topologies (2002).
This award, given each year to a previous conference paper whose impact is still felt 10-12 years after publication, cited the paper’s pioneering work in network tomography, particularly its novel methods to estimate delay, packet loss, and other measures of performance.
AT&T Fellow, 2007.
Network measurements sampling, analysis, and inference: Honored for fundamental contributions to sampling, analysis and inference from network measurements that have had broad impact on AT&T and the industry.
IEEE Fellow, 2005.
For contributions to the measurement, analysis and management of telecommunications networks.
Tiresias: Online Anomaly Detection for Hierarchical Operational Network Data
Nicholas Duffield, Jia Wang, Chi Hong, Matthew Caesar
IEEE ICDCS 2012,
2012.
[PDF]
[BIB]
IEEE Copyright
This version of the work is reprinted here with permission of IEEE for your personal use. Not for redistribution. The definitive version was published in 2012. , 2012-06-18
{Operational network data, management data such as customer
care call logs and equipment system logs, is a very
important source of information for network operators to detect
problems in their networks. Unfortunately, there is lack
of efficient tools to automatically track and detect anomalous
events on operational data, causing ISP operators to rely
on manual inspection of this data. While anomaly detection
has been widely studied in the context of network data, operational
data presents several new challenges, including the
volatility and sparseness of data, and the need to perform fast
detection (complicating application of schemes that require
offline processing or large/stable data sets to converge).
To address these challenges, we propose Tiresias, an automated
approach to locating anomalous events on hierarchical
operational data. Tiresias leverages the hierarchical structure
of operational data to identify high-impact aggregates
(e.g., locations in the network, failure modes) likely to be associated
with anomalous events. To accommodate different
kinds of operational network data, Tiresias consists of an online
detection algorithm with low time and space complexity,
while preserving high detection accuracy. We present results
from two case studies using operational data collected at a
large commercial IPTV network operated by a Tier-1 ISP:
customer care calls log and set-top box crashes log. By comparing
with a reference set verified by the ISP�s operational
group, we validate that Tiresias can achieve > 94% accuracy
in locating anomalies. Tiresias also discovers several previously
unknown anomalies in the ISP�s customer care cases,
demonstrating its effectiveness.}

Don’t Let The Negatives Bring You Down: Algorithms for Sampling from Streams of Signed Updates
Graham Cormode, Edith Cohen, Nicholas Duffield
ACM SIGMETRICS,
2012.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM SIGMETRICS, 2012-06-30.
{Random sampling is a powerful tool for working with large data. Queries over the full
dataset are replaced by approximate queries over the smaller (and hence easier to store and manipulate) sample. The sample constitutes a flexible summary that supports a wide class of queries. But in many applications, datasets are modified with time, and it is desirable to update samples without requiring access to the full underlying datasets. In this paper, we introduce and analyze novel techniques for sampling over dynamic data, modeled as a stream of modifications to weights associated with each key.
While sampling schemes designed for stream applications can often readily accommodate positive updates to the dataset, much less is known for the case of negative updates, where weights are reduced or items deleted altogether. We primarily consider the turnstile model of streams, and extend classic schemes to incorporate negative updates. Perhaps surprisingly, the modifications to handle negative updates turn out to be natural and seamless extensions of the well-known original positive update-only algorithms. We show that they produce unbiased estimators, and relate their performance to the behavior of corresponding algorithms on insert-only streams with different parameters. A careful analysis is necessitated, in order to account for the fact that sampling choices for one key now depend on the choices made for other keys.}

Structure Aware Sampling on Data Streams
Edith Cohen, Graham Cormode, Nicholas Duffield
ACM Sigmetrics 2011 Conference,
2011.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM SIGmetrics 2011 Conference , 2011-06-07.
{The massive data streams observed in network monitoring, data processing and
scientific studies are typically too large to store. For many applications over such data, we must obtain compact summaries of the stream. These summaries should allow accurate answering of post hoc queries with estimates which approximate the true answers over the original stream. The data often has an underlying structure which makes certain subset queries, in particular {em range queries}, more relevant than arbitrary subsets. Applications such as access control, change detection, and heavy hitters typically involve subsets that are ranges
or unions thereof.
Random sampling is a natural summarization tool, being easy to implement and flexible to use. Known sampling methods are good for arbitrary queries but fail to optimize for the common case of range queries. Meanwhile, specialized summarization algorithms have been proposed for range-sum queries and related problems. These can outperform sampling giving fixed space resources, but lack
its flexibility and simplicity. Particularly, their accuracy degrades when queries span multiple ranges.
We define new stream sampling algorithms with a smooth and tunable trade-off between accuracy on range-sum queries and arbitrary subset-sum queries. The technical key is to relax requirements on the variance over all subsets to enable better performance on the ranges of interest. This boosts the accuracy on range queries while retaining the prime benefits of sampling, in particular flexibility and accuracy, with tail bounds having optimality guarantees. Our experimental study indicates that structure-aware summaries drastically improve range-sum accuracy with respect to state-of-the-art stream sampling algorithms and outperform deterministic methods on range-sum queries and hierarchical heavy
hitter queries.}

Making Sense of Customer Tickets in Cellular Networks
Yu Jin, Nicholas Duffield, Alexandre Gerber, Patrick Haffner, Wen Hsu, Guy Jacobson, Shobha Venkataraman, Zhi-Li Zhang, Subhabrata Sen
in Proc. IEEE INFOCOM Mini-Conference,
2011.
[PDF]
[BIB]
{Abstract�Effective management of large-scale cellular data
networks is critical to meet customer demands and expectations.
Customer calls for technical support provides direct indication as
to the issues and problems customers encounter. In this paper we
study the customer tickets � free-text recordings and classifications
by customer support agents � collected at a large cellular network
provider, with two inter-related goals: i) to characterize and
understand the major factors which lead to customers to call
and seek support; and ii) to utilize such customer tickets to
help identify potential network problems. For this purpose, we
develop a novel statistical approach to model customer call rates
which account for customer-side factors (e.g., user tenure and
handset types) as well as geo-locations. We show that most calls
are due to customer-side factors and can be well captured by the
model. Furthermore, we also demonstrate that location-specific
deviations from the model provide a good indicator of potential
network-side issues. The latter is corroborated with the detailed
analysis of customer tickets and other independent data sources
(non-ticket customer feedback and network performance data).}

Large-scale App-based Reporting of Customer Problems in Cellular Networks: Potential and Limitations
Yu Jin, Nicholas Duffield, Alexandre Gerber, Patrick Haffner, Wen Hsu, Guy Jacobson, Subhabrata Sen, Shobha Venkataraman, Zhi-Li Zhang
in Proc. ACM SIGCOMM Workshop on Measurements Up the STack (W-MUST),
2011.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM SIGCOMM Workshop on Measurements Up the STack (Wâ€MUST) , 2011-08-19.
{Multidimensional distributions are often used in data min- ing to describe and summarize different features of large datasets. It is natural to look for distinct classes in such datasets by clustering the data. A common approach entails the use of methods like k-means clustering. However, the k-means method inherently relies on the Euclidean metric in the embedded space and does not account for additional topology underlying the distribution.
In this paper, we propose using Earth Mover Distance (EMD) to compare multidimensional distributions. For a n-bin histogram, the EMD is based on a solution to the transportation problem with time complexity O(n3 log n). To mitigate the high computational cost of EMD, we pro- pose an approximation that reduces the cost to linear time.
Other notions of distances such as the information theo- retic Kullback-Leibler divergence and statistical χ2 distance, account only for the correspondence between bins with the same index, and do not use information across bins, and are sensitive to bin size. A cross-bin distance measure like EMD is not affected by binning differences and meaningfully matches the perceptual notion of “nearness”.
Our technique is simple, efficient and practical for clus- tering distributions. We demonstrate the use of EMD on a practical application of clustering over 400,000 anonymous mobility usage patterns which are defined as distributions over a manifold. EMD allows us to represent inherent re- lationships in this space. We show that EMD allows us to successfully cluster even sparse signatures and we compare the results with other clustering methods. Given the large size of our dataset a fast approximation is crucial for this application.}

Efficient stream sampling for variance-optimal estimation of subset sums
Edith Cohen, Nicholas Duffield, Carsten Lund, Mikkel Thorup, Haim Kaplan
SIAM Journal on Computing,
2011.
[PDF]
[BIB]
SIAM (Society fir Industrial and Applied Mathematics) Copyright
The definitive version was published in SIAM Journal on Computing. , 2011-06-01
{From a high volume stream of weighted items, we want to maintain a generic sample of a certain limited size k that we can later use to estimate the total weight of arbitrary subsets. This is the classic context of on-line reservoir sampling, thinking of the generic sample as a reservoir. We present an efficient reservoir sampling scheme, VAROPTk, that dominates all previous schemes in terms of estimation quality. VAROPTk provides variance optimal unbiased estimation of subset sums. More precisely, if we have seen n items of the stream, then for any subset size m, our scheme based on k samples minimizes the average variance over all subsets of size m. In fact, the optimality is against any off-line scheme with
k samples tailored for the concrete set of items seen. In addition to optimal average variance, our scheme provides tighter worst-case bounds on the variance of particular subsets than previously possible. It is efficient, handling each new item of the stream in O(log k) time. Finally, it is particularly well suited for
combination of samples from different streams in a distributed setting.
}

NEVERMIND, the Problem Is Already Fixed: Proactively Detecting and Troubleshooting Customer DSL Problems
Yu Jin, Nicholas Duffield, Alexandre Gerber, Patrick Haffner, Subhabrata Sen, Zhi-Li Zhang
in Proc. of ACM CoNext,
2010.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2010. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM CoNext 2010 , 2010-11-30.
{Traditional DSL troubleshooting solutions are reactive, relying
mainly on customers to report problems, and tend to
be labor-intensive, time consuming, prone to incorrect resolutions
and overall can contribute to increased customer
dissatisfaction. In this paper, we propose a proactive approach
to facilitate troubleshooting customer edge problems
and reducing customer tickets. Our system consists of: i) a
ticket predictor which predicts future customer tickets; and
ii) a trouble locator which helps technicians accelerate the
troubleshooting process during field dispatches. Both components
infer future tickets and trouble locations based on
existing sparse line measurements, and the inference models
are constructed automatically using supervised machine
learning techniques. We propose several novel techniques to
address the operational constraints in DSL networks and to
enhance the accuracy of NEVERMIND. Extensive evaluations
using an entire years worth of customer ticket and measurement
data from a large network show that our method
can predict thousands of future customer tickets per week
with high accuracy and reduce significantly reduce the time
and effort for diagnosing these tickets. This is beneficial as it
has the effect of both reducing the number of customer care
calls and improving customer satisfaction.}

FlowRoute: Inferring Forwarding Table Updates Using Passive Flow-level Measurements
Lee Breslau, Cheng Ee, Alexandre Gerber, Subhabrata Sen, Amogh Dhamdhere, Nicholas Duffield, Carsten Lund
in Proc. of ACM Internet Measurement Conference (IMC),
2010.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2010. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Internet Measurement Conference , 2010-11-01.
{The reconvergence of routing protocols in response to changes
in network topology can impact application performance.
While improvements in protocol specification and implementation have significantly reduced reconvergence times,
increasingly performance-sensitive applications continue to
raise the bar for these protocols. As such, monitoring the
performance of routing protocols remains a critical activity
for network operators. We design FlowRoute, a tool based
on passive data plane measurements that we use in conjunction with control plane monitors for offline debugging and
analysis of forwarding table dynamics. We discuss practical
constraints that affect FlowRoute, and show how they can
be addressed in real deployment scenarios. As an application of FlowRoute, we study forwarding table updates by
backbone routers at a tier-1 ISP. We detect interesting behavior such as delayed forwarding table updates and routing
loops due to buggy routers � confirmed by network opera-
tors � that are not detectable using traditional control plane
monitors.}

Trajectory Engine: A backend for trajectory sampling
Nicholas Duffield, Alexander Gerber, Matthias Grossglauser
In Proc. IEEE Network Operations and Management Symposium (NOMS),
2002.
[PDF]
[BIB]
IEEE Copyright
This version of the work is reprinted here with permission of IEEE for your personal use. Not for redistribution. The definitive version was published in Proc. IEEE Network Operations and Management Symposium (NOMS), 2002. , 2002-04-19
{The management of communications networks increasingly requires detailed knowledge of network usage, acquired by direct measurement. We report on the design and implementation of a backend system for Trajectory Sampling, a method for the consistent sampling of packets in transmission across a network domain. This Trajectory Engine collects trajectory samples and stores them after appropriate preprocessing. It provides a querying and visualization tool to aid in traffic engineering and troubleshooting tasks. }
Network tomography on general topologies
Nicholas Duffield, Tian Bu, Francesco Lo Presti, Don Towsley
Proceedings ACM Sigmetrics 2002,
2002.
[PDF]
[BIB]
{In this paper we consider the problem of inferring link-level loss rates from end-to-end multicast measurements taken from a collection of trees. We give conditions under which loss rates are identifiable on a specified set of links. Two algorithms are presented to perform the link-level inferences for those links on which losses can be identified. One, the {em minimum variance weighted average (MVWA) algorithm} treats the trees separately and then averages the results. The second, based on {em expectation-maximization (EM)} merges all of the measurements into one computation. Simulations show that EM is slightly more accurate than MVWA, most likely due to its more efficient use of the measurements. We also describe extensions to the inference of link-level delay, inference from end-to-end unicast measurements, and inference when some measurements are missing. To appear in Proceedings ACM Sigmetrics 2002 }
System And Method For Customized Voice Response,
Tue May 14 17:26:22 EDT 2013
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating an accent source. A system practicing the method collects data associated with customer specific services, generates country-specific or dialect-specific weights for each service in the customer specific services list, generates a summary weight based on an aggregation of the country-specific or dialect-specific weights, and sets an interactive voice response system language model based on the summary weight and the country-specific or dialect-specific weights. The interactive voice response system can also change the user interface based on the interactive voice response system language model. The interactive voice response system can tune a voice recognition algorithm based on the summary weight and the country-specific weights. The interactive voice response system can adjust phoneme matching in the language model based on a possibility that the speaker is using other languages.
Scalable Traffic Classifier And Classifier Training System,
Tue Nov 13 16:12:21 EST 2012
A traffic classifier has a plurality of binary classifiers, each associated with one of a plurality of calibrators. Each calibrator trained to translate an output score of the associated binary classifier into an estimated class probability value using a fitted logistic curve, each estimated class probability value indicating a probability that the packet flow on which the output score is based belongs to the traffic class associated with the binary classifier associated with the calibrator. The classifier training system configured to generate a training data based on network information gained using flow and packet sampling methods. In some embodiments, the classifier training system configured to generate reduced training data sets, one for each traffic class, reducing the training data related to traffic not associated with the traffic class.
Multicast-Based Inference Of Temporal Loss Characteristics In Packet Data Networks,
Tue Jul 31 16:11:13 EDT 2012
Disclosed are method and apparatus for characterizing the temporal loss characteristics of a packet data network by multicast-based inference. Multicast probes are transmitted from a source node to a plurality of receiver nodes, which record the arrivals of the multicast probes. From the aggregate data comprising recorded arrivals of the end-to-end paths from the source node to each receiver node, temporal loss characteristics of individual links within the network may be calculated. In a network with a tree topology, the complexity of calculations may be reduced through a process of subtree partitioning.
Method For Summarizing Data In Unaggregated Data Streams,
Tue Jun 05 16:10:40 EDT 2012
A method for producing a summary A of data points in an unaggregated data stream wherein the data points are in the form of weighted keys (a, w) where a is a key and w is a weight, and the summary is a sample of k keys a with adjusted weights w.sub.a. A first reservoir L includes keys having adjusted weights which are additions of weights of individual data points of included keys and a second reservoir T includes keys having adjusted weights which are each equal to a threshold value .tau. whose value is adjusted based upon tests of new data points arriving in the data stream. The summary combines the keys and adjusted weights of the first reservoir L with the keys and adjusted weights of the second reservoir T to form the sample representing the data stream upon which further analysis may be performed. The method proceeds by first merging new data points in the stream into the reservoir L until the reservoir contains k different keys and thereafter applying a series of tests to new arriving data points to determine what keys and weights are to be added to or removed the reservoirs L and T to provide a summary with a variance that approaches the minimum possible for aggregated data sets. The method is composable, can be applied to high speed data streams such as those found on the Internet, and can be implemented efficiently.
System And Method For Inferring Wireless Trajectories In A Cellular Telephone Network,
Tue Feb 21 16:09:23 EST 2012
A device includes a processor configured to determine a number of users in each of a plurality of wireless telephone cells of a trajectory in a wireless telephone network. The processor is also configured to determine handoff data between each adjacent pair of the wireless telephone cells, and to determine a first number of users traveling along the trajectory in the wireless telephone network while on a telephone call. The processor also calculates a total number of users associated with the trajectory in the wireless telephone network based on the handoff data between each adjacent pair of the wireless telephone cells, and based on the first number of users traveling along the trajectory while on the telephone call.
Method And Apparatus For Providing A Measurement Of Performance For A Network,
Tue Dec 06 16:06:42 EST 2011
A method and an apparatus for providing a measurement of performance for a network are disclosed. For example, the method sends a plurality of multi-objective probes on a path, and receives one or more of said plurality of multi-objective probes for the path. The method then determines a plurality of performance measurements.
System And Method For Spatially Consistent Sampling Of Flow Records At Constrained, Content-Dependent Rates,
Tue Nov 22 16:06:37 EST 2011
Disclosed herein are systems, computer-implemented methods, and computer-readable media for sampling network traffic. The method includes receiving a desired quantity of flow record to sample, receiving a plurality of network flow record each summarizing a network flow of packets, calculating a hash for each flow record of based on one or more invariant part of a respective flow, generating a quasi-random number from the calculated hash for each respective flow record, generating a priority from the calculated hash for each respective flow record, and sampling exactly the desired quantity of flow records, selecting flow records having a highest priority first. In one aspect, the method further partitions the plurality of flow records into groups based on flow origin and destination, generates an individual priority for each partitioned group, and separately samples exactly the desired quantity of flow records from each partitioned group, selecting flows having a highest individual priority first.
Optimal Combination Of Sampled Network Measurements,
Tue Sep 27 16:06:11 EDT 2011
Two regularized estimators that avoid the pathologies associated with variance estimation are disclosed. The regularized variance estimator adds a contribution to estimated variance representing the likely error, and hence ameliorates the pathologies of estimating small variances while at the same time allowing more reliable estimates to be balanced in the convex combination estimator. The bounded variance estimator employs an upper bound to the variance which avoids estimation pathologies when sampling probabilities are very small.
Variance-Optimal Sampling-Based Estimation Of Subset Sums,
Tue Aug 23 16:06:03 EDT 2011
The present invention relates to a method of obtaining a generic sample of an input stream. The method is designated as VAROPT.sub.k. The method comprises receiving an input stream of items arriving one at a time, and maintaining a sample S of items i. The sample S has a capacity for at most k items i. The sample S is filled with k items i. An n.sup.th item i is received. It is determined whether the n.sup.th item i should be included in sample S. If the n.sup.th item i is included in sample S, then a previously included item i is dropped from sample S. The determination is made based on weights of items without distinguishing between previously included items i and the n.sup.th item i. The determination is implemented thereby updating weights of items i in sample S. The method is repeated until no more items are received.
Method And Apparatus For Providing Performance Measurement For A Network Tunnel,
Tue Aug 23 16:06:00 EDT 2011
A method and apparatus for providing performance measurements on network tunnels in packet networks are disclosed. For example, the method establishes two tunnels between a first measurement host and a first router, and establishes a tunnel between the first router and a second measurement host. The method also establishes a multicast group having a plurality of members, and sends one or more packets addressed to the multicast group from the first measurement host. The method measures the frequencies of directly and/or indirectly received responses from the plurality of members of the multicast group, and provides a plurality of estimated values for a plurality of packet transmission rates from measurement of the frequencies for one or more of said tunnels.
Method For Implementing And Reporting One-Way Network Measurements,
Tue May 31 16:05:19 EDT 2011
A method is disclosed for implementing and reporting network measurements between a source of probe packets and an element, such as a router. The invention exploits commonly implemented features on commercial elements. By exploiting these features, the expense of deploying special purpose measurement devices can be avoided. In one aspect of the invention, a plurality of probe packets is transmitted in a packet network with each of the probe packets having the same key and the same aggregation characteristic. A report is then received from an instructionless element regarding the plurality of probe packets, thereby enabling measurement of a parameter of the packet network.
Method And Apparatus For One-Way Passive Loss Measurements Using Sampled Flow Statistics,
Tue Apr 12 16:04:51 EDT 2011
A packet loss estimation technique is disclosed that utilizes the sampled flow level statistics that are routinely collected in operational networks, thereby obviating the need for any new router features or measurement infrastructure. The technique is specifically designed to handle the challenges of sampled flow-level aggregation such as information loss resulting from packet sampling, and generally comprises: receiving a first record of sampled packets for a flow from a first network element; receiving a second record of sampled packets for the flow from a second network element communicating with the first network element; correlating sampled packets from the flow at the first network element and the second network element to a measurement interval; and estimating the packet loss using a count of the sampled packets correlated to the measurement interval.
Method And Apparatus For Detection Of Hierarchical Heavy Hitters,
Tue Mar 01 16:04:34 EST 2011
An efficient streaming method and apparatus for detecting hierarchical heavy hitters from massive data streams is disclosed. In one embodiment, the method enables near real time detection of anomaly behavior in networks.
Sampling And Analyzing Packets In A Network,
Tue Dec 14 15:05:20 EST 2010
The preferred embodiments of the present invention can include sampling packets transmitted over a network based on the content of the packets. If a packet is sampled, the sampling unit can add one or more fields to the sampled packet that can include a field for a number of bytes contained in the packet, a packet count, a flow count, a sampling type, and the like. The sampled packets can be analyzed to discern desired information from the packets. The additional fields that are added to the sampled packets can be used during the analysis.
Algorithms And Estimators For Summarization Of Unaggregated Data Streams,
Tue Jul 27 15:04:14 EDT 2010
The invention relates to streaming algorithms useful for obtaining summaries over unaggregated packet streams and for providing unbiased estimators for characteristics, such as, the amount of traffic that belongs to a specified subpopulation of flows. Packets are sampled from a packet stream and aggregated into flows and counted by implementation of: (a) Adaptive Sampled NetFlow (ANF), and adjusted weight (A.sup.A.sup.NF) of a flow (f) is calculated as follows: A.sup.A.sup.NF(f)=i(f)/p'; i(f) being the number of packets counted for a flow f, and p' being the sampling rate at end of a measurement period; or (b) Adaptive Sample-and-Hold (ASH), and adjusted weight (A.sup.A.sup.SH) of a flow (f) is calculated as follows: A.sup.A.sup.SH(f)=i(f)+(1-p')/p'; i(f) being the number of packets counted for a flow f, and p' being the sampling rate at end of a measurement period.
Algorithms And Estimators For Summarization Of Unaggregated Data Streams,
Tue Jun 29 15:04:09 EDT 2010
The invention relates to streaming algorithms useful for obtaining summaries over unaggregated packet streams and for providing unbiased estimators for characteristics, such as, the amount of traffic that belongs to a specified subpopulation of flows. Packets are sampled from a packet stream and aggregated into flows and counted by implementation of Adaptive Sample-and-Hold (ASH) or Adaptive NetFlow (ANF), adjusting the sampling rate based on a quantity of flows to obtain a sketch having a predetermined size, the sampling rate being adjusted in steps; and transferring the count of aggregated packets from SRAM to DRAM and initializing the count in SRAM following adjustment of the sampling rate.
Statistical, Signature-Based Approach To IP Traffic Classification,
Tue Feb 09 15:03:27 EST 2010
A signature-based traffic classification method maps traffic into preselected classes of service (CoS). By analyzing a known corpus of data that clearly belongs to identified ones of the preselected classes of service, in a training session the method develops statistics about a chosen set of traffic features. In an analysis session, relative to traffic of the network where QoS treatments are desired (target network), the method obtains statistical information relative to the same chosen set of features for values of one or more predetermined traffic attributes that are associated with connections that are analyzed in the analysis session, yielding a statistical features signature of each of the values of the one or more attributes. A classification process then establishes a mapping between values of the one or more predetermined traffic attributes and the preselected classes of service, leading to the establishment of QoS treatment rules.
Adaptive Defense Against Various Network Attacks,
Tue Sep 08 16:08:00 EDT 2009
An apparatus for optimizing a filter based on detected attacks on a data network includes an estimation means and an optimization means. The estimation means operates when a detector detects an attack and the detector transmits an inaccurate attack severity. The estimation means determines an accurate attack severity. The optimization means adjusts a parameter and the parameter is an input to a filter.
Traffic Matrix Estimation Method And Apparatus,
Tue Aug 11 16:07:48 EDT 2009
A method and apparatus for the estimation of traffic matrices in a network are disclosed.
Optimal Combination Of Sampled Measurements,
Tue May 19 16:07:00 EDT 2009
Two regularized estimators that avoid the pathologies associated with variance estimation are disclosed. The regularized variance estimator adds a contribution to estimated variance representing the likely error, and hence ameliorates the pathologies of estimating small variances while at the same time allowing more reliable estimates to be balanced in the convex combination estimator. The bounded variance estimator employs an upper bound to the variance which avoids estimation pathologies when sampling probabilities are very small.
Consistent Sampling For Network Traffic Measurement,
Tue Mar 24 16:07:00 EDT 2009
Traffic measurement should make it possible to obtain the spatial flow of traffic through the domain, i.e., the paths or trajectories followed by packets between any ingress and egress point of the domain. A method of sampling packet trajectories in a packet switching network allows the direct inference of traffic flows through a measurement domain by observing the trajectories of a subset of all packets traversing the network. A method which assumes that the measurement domain does not change comprises the steps of selecting packets for sampling in accordance with a sampling function of the packet content and generating a practically unique label for each sampled packet. The method does not rely on routing state, its implementation cost is small, and the measurement reporting traffic is modest and can be controlled precisely. Using the same hash function will yield the same sample set of packets in the entire domain, and enables us to reconstruct packet trajectories. An alternate embodiment which assumes no constraints and that the measurement domain may change comprises the steps of applying a sampling function and altering an invariant bit position as a signaling flag in each packet selected for sampling.
Methods and apparatus for detection of hierarchical heavy hitters,
Tue Oct 14 18:13:03 EDT 2008
An efficient streaming method and apparatus for detecting hierarchical heavy hitters from massive data streams is disclosed. In one embodiment, the method enables near real time detection of anomaly behavior in networks.
Apparatus for size-dependent sampling for managing a data network,
Tue Nov 20 18:12:28 EST 2007
The present invention provides apparatus for sampling data flows in a data network in order to estimate a total data volume in the network. Sampling the data flows in the data network reduces the network resources that must be expended by the network to support the associated activity. The present invention enables the service provider of the data network to control sampled volumes in relation to the desired accuracy. The control can be either static or can be dynamic for cases in which the data volumes are changing as a function of time.
Traffic matrix estimation method and apparatus,
Tue Nov 06 18:12:23 EST 2007
A method and apparatus for the estimation of traffic matrices in a network are disclosed. Mechanisms are disclosed for measuring traffic volume from a plurality of ingress points to a plurality of egress points in a large scanl network, such as an IP backbone network. The traffic matrix is advantageously inferred from widely available link load measurements such as SNMP data.
Virtual private network,
Tue Mar 27 17:08:41 EDT 2007
The invention provides apparatus and methods for a Virtual Private Network (VPN) in a network that offers a simple user interface for efficient utilization of network resources. The VPN is defined for a specified set of endpoints each of which is associated with a single hose. A hose provides access to the VPN through an access point which may be a node of the network, for example. The hose is a single interface to the VPN for communication to all other endpoints of the VPN. The VPN achieves network resource allocation efficiency by exploiting resource sharing possibilities via multiplexing routing paths between endpoints and dynamic resource allocation techniques that permit real time resource allocation resizing. When a VPN is established with a VPN service provider, the routing paths between the endpoints of the VPN is optimized for multiplexing opportunities so that resource allocations between nodes along routing paths within the IP network is reduced to a minimum.
Method and apparatus for size-dependent sampling for managing a data network,
Tue Jul 18 18:11:25 EDT 2006
The present invention provides a method and apparatus for sampling data flows in a data network in order to estimate a total data volume in the network. Sampling the data flows In the data network reduces the network resources that must be expended by the network to support the associated activity. The present Invention enables the service provider of the data network to control sampled volumes in relation to the desired accuracy. The control can be either static or can be dynamic for cases in which the data volumes are changing as a function of time
Virtual Private Network,
Tue Jun 28 17:08:39 EDT 2005
The invention provides apparatus and methods for a Virtual Private Network (VPN) in a network that offers a simple user interface for efficient utilization of network resources. The VPN is defined for a specified set of endpoints each of which is associated with a single hose. A hose provides access to the VPN through an access point which may be a node of the network, for example. The hose is a single interface to the VPN for communication to all other endpoints of the VPN. The VPN achieves network resource allocation efficiency by exploiting resource sharing possibilities via multiplexing routing paths between endpoints and dynamic resource allocation techniques that permit real time resource allocation resizing. When a VPN is established with a VPN service provider, the routing paths between the endpoints of the VPN is optimized for multiplexing opportunities so that resource allocations between nodes along routing paths within the IP network is reduced to a minimum.
Consistent sampling for network traffic measurement,
Tue Mar 29 18:10:21 EST 2005
Traffic measurement should make it possible to obtain the spatial flow of traffic through the domain, i.e., the paths or trajectories followed by packets between any ingress and egress point of the domain. A method of sampling packet trajectories in a packet switching network allows the direct inference of traffic flows through a measurement domain by observing the trajectories of a subset of all packets traversing the network. A method which assumes that the measurement domain does not change comprises the steps of selecting packets for sampling in accordance with a sampling function of the packet content and generating a practically unique label for each sampled packet. The method does not rely on routing state, its implementation cost is small, and the measurement reporting traffic is modest and can be controlled precisely. Using the same hash function will yield the same sample set of packets in the entire domain, and enables us to reconstruct packet trajectories. An alternate embodiment which assumes no constraints and that the measurement domain may change comprises the steps of applying a sampling function and altering an invariant bit position as a signaling flag in each packet selected for sampling.
Fair queuing system with adaptive bandwidth redistribution,
Tue Sep 17 18:08:28 EDT 2002
Apparatus for routing packets in a communication network comprises a plurality of per-connection queues, each queue established for receiving packets from a respective source and temporarily storing received packets before routing to a particular destination; a weighted fair-queuing scheduler for servicing packets from each of the plurality of per-connection queues at guaranteed pre-allocated rates; a sensing device for sensing a presence or absence of packets in queues, the absence of packets in queues indicating availability of excess bandwidth; and, a state dependent scheduler for redistributing excess bandwidth upon sensing of queues absent packets, the state dependent scheduler servicing those queues in accordance with a state variable corresponding to a performance property of the queues, wherein delay and isolation properties for routing packets of respective queues in weighted fair-queuing is preserved.
Method and apparatus for smoothing and multiplexing video data flows,
Tue Oct 30 18:07:15 EST 2001
A method and apparatus provide a smoothing and rate adaptation algorithm to facilitate the flow of video data, maintaining video quality while avoiding potentially harmful buffering delays. The invention uses a smoothing interval to determine a rate to request for allocation. The invention also adapts the encoding rate in relation to a target delay for a source buffer.