A Call for More Energy-Efficient Apps

by: Staff with Alexandre Gerber, Subhabrata Sen, Oliver Spatscheck, Thu Apr 07 16:47:00 EDT 2011


Users sometimes notice that apps drain the battery quickly, some twice as fast as other apps doing similar tasks. Other apps can be slow to load pages or respond to input.

Why do these problems occur? And how is it that some apps perform better than others?

For a long time, the answers to these questions were not clear. With developers focused on the app software, device makers on the device, and network engineers on the (radio) network, information was fragmented and siloed; each group was knowledgeable about its own domain but not about the others.

To resolve the issues, AT&T researchers working with colleagues from the University of Michigan undertook an in-depth, comprehensive investigation of the end-to-end data transmission paths. They ultimately traced the problems to the complex interactions between the device and the cellular network. These interactions are especially hard to see because the layered network architecture intentionally hides lower-level protocols from developers working in the application layer.


The layered network architecture is designed to hide implementation details at each layer, making it possible to build complex systems, but also obscuring the source of problems.

This article summarizes how researchers were finally able to understand the nature of the interactions and build an entirely new technology that makes them visible, allowing researchers to diagnose specific inefficiencies of individual apps. One popular app was found to be using more than 40% of its power consumption to transmit 0.2% of its data.

And in discovering the inefficiencies and their causes, researchers also identified solutions that turn out to be surprisingly easy to implement.


The specific characteristics of the cellular network

In theory developers should be able to develop software without regard to the underlying physical layer or network. But the reality is that the cellular environment differs fundamentally from the traditional wired environment of PCs and the desktop that many developers are familiar with.

The cellular network is a high-latency, limited-resource environment.

While the wired environment can count on fast Internet connections and bountiful resources, the cellular network is characterized by constrained resources and high latencies.

The device is constrained by a small battery (network connections require significant battery power), and the cellular network by a set amount of spectrum capacity that must be shared among a potentially large number of network users. At times of heavy demand, the cell tower can become overloaded and must share capacity among several devices.






Both the limited resources on the device (battery power) and on the cellular network (spectrum) have to be carefully managed and allocated.

The network conserves capacity by opening connections only when needed. But the overhead in opening a connection—the exchange of control messages between network and device, the calculations needed to determine whether sufficient capacity is available—introduces a delay, or latency, of between one and two seconds. Thus the necessity of resource allocation is itself a source of latency on the cellular network (there are others).


The different energy states of a 3G (UMTS) device

The device conserves battery power by maintaining different energy states, depending on whether a network connection is needed or not.

For a typical 3G (UMTS) network, there are three main energy states. The idle state (IDLE) consumes very little power and is the default state when no network connection is needed. The full-power state (DCH) is used for transmitting data at a high rate to and from the device (download speeds of around 7.1 Mbits/sec); it consumes significant battery energy. The intermediate state (FACH) consumes roughly 50 percent of the power of full-power mode. In the half-power FACH state, the device can only transmit user data through shared low-speed channels that are typically less than 15 kbps.

Each promotion entails a delay.

Energy states are characterized not only by battery consumption, but by utilization of network resources. In fact, the app and network together negotiate the proper energy state for the device. When the device needs to transmit data, it signals the network and the network allocates a dedicated channel; or if data is being transmitted to the device, the network alerts the device. The culmination of this process is a promotion of the device to full power so it can transmit data.

Even when the device is at half power, network resources are still being consumed, though at a fraction of what's needed for full power (in the half-power state, several devices may share the same low-bandwidth channel). When the device is at idle, no network resources are being consumed.

There is thus a direct relationship between battery power consumption and network resource utilization.


The implications of a state machine for an app

What determines the current energy state of the device (and the corresponding utilization of network resources) is a behavioral model called a state machine. The state machine, which does not exist in the wired environment, is part of all wireless networks (3G, UMTS, 2G, LTE, Wi-Fi), though the state machine on each network operates differently (the number of states might be different, for example). This article uses the specific example of a typical UMTS network and its radio resource control (RRC) state machine.

The impact of the state machine on an app is probably most felt in how the state machine parameters are set to promote or demote the device to a higher or lower power state. Both promotions and demotions carry a cost.

For promotions, the cost is in the delay as control messages are exchanged between the device and network. This delay (about 2 seconds from idle to full power and about 1.5 seconds from half power to full power) is often noticeable to users since the app is not responsive to user input during this time.

Once in full-power mode, there is no further promotion delay (just the typically small, millisecond delays as data is requested and then received from remote servers). But it's not good design to keep the device at full power for too long; it consumes battery power at a high rate and monopolizes channels that may be needed for other network users. For this reason, the state machine calls for the device to be demoted to the next lower energy state as soon as a data transmission ends. But when is that?

Is a pause in data the end of data, or just an interval between data transmissions? Because of the uncertainty, an inactivity timer is used to trigger the demotion.


Promotions (triggered by data transmissions) are associated with a delay, and demotions (triggered by timers) entail wasteful tail times that consume resources even though no data is being transmitted.


During this time-out period, called a tail time, the device is still consuming battery resources at a high rate and monopolizing network resources even though no data is being transmitted or received.


Determining the optimum tail time is not easy, involving complex tradeoffs that directly affect the app user and network users.

On a typical 3G network, the tail time from half-power to idle is conservatively set at 12 seconds, a long time to be consuming resources when no data is being transferred. The tail could be reduced (and some networks do use a shorter tail time than others), but there is a tradeoff: an increased chance that a promotion, and its attendant delay, will be needed if the demotion occurs prematurely. For one app, researchers observed that shortening the tail timer by three seconds reduced resource usage by 40% but increased state promotions by 31%.
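The tradeoff can be illustrated with a toy model. The sketch below is a deliberate simplification and not the researchers' analysis: it collapses the two tail timers into a single timer, and the function name `tail_tradeoff` and the sample gaps are hypothetical.

```python
def tail_tradeoff(gaps_s, tail_timer_s):
    """Return (tail_seconds, promotions) for the idle gaps between bursts.

    Simplified model (an assumption): one high-power state and one timer.
    After each burst the radio stays up for min(gap, timer) seconds. If the
    gap outlasts the timer, the radio is demoted and the next burst has to
    pay for a promotion.
    """
    tail_seconds = sum(min(gap, tail_timer_s) for gap in gaps_s)
    promotions = sum(1 for gap in gaps_s if gap > tail_timer_s)
    return tail_seconds, promotions


# Hypothetical idle gaps (in seconds) between consecutive bursts of one app.
gaps = [2, 4, 8, 10, 14, 30, 60]

long_tail = tail_tradeoff(gaps, 12)   # 12-second timer
short_tail = tail_tradeoff(gaps, 9)   # timer shortened by three seconds

print("12 s timer:", long_tail)
print(" 9 s timer:", short_tail)
```

With these made-up gaps, the shorter timer spends less time holding the channel open but triggers one more promotion, which is the same kind of tradeoff the researchers measured on real traces.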

It’s because of the tail time that transmitting even small amounts of data consumes significant battery power.

Each network decides on the tradeoff it wants to make: a long tail time that consumes battery power longer but enables the device to respond quicker to user input, or a shorter tail time to conserve battery power but at the cost of slower response times. Fast response or resource conservation? Limited resources often force such tradeoffs.

(Some device makers provide a mechanism, fast dormancy, for cutting short the long tail time to free up resources more quickly. See sidebar.)


A high-latency environment

State promotions are only one source of latency in the cellular network. Even after the device is connected, there are other latencies associated with the round trips needed to request and receive data from remote servers.

These round-trip latencies also exist in the wired environment, but they are much shorter there: around 25 ms (assuming the round trip occurs entirely within the continental US) versus 100-200 ms in the cellular environment, due to the cellular network’s hierarchical architecture.

A single download can entail many different round trips, each with its own latency. There’s the initial latency to allocate resources (unless the device is already at full power). Then comes a series of round trips (each 100-200 ms): setting up transport connections (usually TCP, but not always), DNS lookups to retrieve the IP addresses of remote servers, and a series of HTTP requests (usually executed sequentially) to retrieve webpage objects (some news apps download up to 150 objects). The sidebar provides an illustrated and simplified view of the entire exchange.


Some round-trip latencies can be reduced or eliminated by opening persistent connections (where one TCP or other transport connection is used to retrieve several objects) or by pipelining HTTP requests, though pipelining is not often used, for a variety of reasons.
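A back-of-the-envelope estimate shows how quickly these round trips add up and how much a persistent connection can save. The sketch below is only illustrative: the function name `page_load_latency_s` is hypothetical, and it assumes that each DNS lookup, connection setup and HTTP request costs exactly one round trip and that transfer time is negligible.

```python
def page_load_latency_s(num_objects, rtt_s, promotion_s=2.0,
                        persistent=False, dns_lookups=1):
    """Estimate the latency of fetching objects one after another.

    Assumptions (not from the article): one round trip per DNS lookup,
    one per connection setup, one per HTTP request; no transfer time.
    """
    connections = 1 if persistent else num_objects
    round_trips = dns_lookups + connections + num_objects
    return promotion_s + round_trips * rtt_s


# A news app downloading 150 objects over a 150 ms cellular round trip.
separate = page_load_latency_s(150, 0.15)
reused = page_load_latency_s(150, 0.15, persistent=True)

print(f"new connection per object: {separate:.1f} s")
print(f"one persistent connection: {reused:.1f} s")
```

Under these assumptions, reusing a single connection removes one round trip per object, which roughly halves the estimated latency for a page with many small objects.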


HTTP pipelining parallelizes HTTP requests in a single TCP connection to reduce the delays that degrade customer experience.


Lost packets—common on wireless networks, especially as mobile calls are routinely handed off among cell towers—add even more delay since the radio network needs to implement packet recovery. And when packets aren't recovered, more delay is introduced since TCP assumes that congestion is the sole cause of packet loss, and reacts by slowing things down: reducing the transmission rate and lengthening retransmission timers.


Impact of interactions between state machine and apps

The interaction of the state machine with apps and the network obviously plays an important role in app performance and energy efficiency. But the state machine itself is not obvious; like other implementation details, it is normally hidden by the APIs commonly used by developers.

If developers don’t know about the state machine (later conversations confirmed that they hadn't), it’s hard for them to understand the source of inefficiencies in their apps and harder still to fix them. Unaware of how to adapt apps for the specific characteristics of the cellular network, many developers continue to write apps the same way they’ve always written software. For many, this means writing software as they did for the wired environment.

But certain data transmission methods common in the wired environment, such as sending a series of small bursts or streaming video or audio at a constant, smooth rate, only exacerbate the latencies and power constraints of the cellular network.


Especially inefficient is sending data as a series of small bursts (a burst is defined as a complete data transfer of any size). If small bursts are spaced out at regular intervals, the device must constantly ramp up and then down, not only draining the battery but introducing a two-second delay for each new connection. Large bursts are more efficient since they entail fewer promotions and fewer demotions.  
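One way to turn many small bursts into a few larger ones is to queue delay-tolerant messages and send them together. The following is a minimal sketch of that idea, not code from any of the apps studied: the class name `BurstBatcher`, the `send_burst` callback and the 300-second limit are all assumptions made for illustration.

```python
import time


class BurstBatcher:
    """Queue small, delay-tolerant messages and send them as one burst."""

    def __init__(self, send_burst, max_delay_s=300.0, clock=time.monotonic):
        self._send_burst = send_burst    # hypothetical callable doing the upload
        self._max_delay_s = max_delay_s  # longest a message may wait (assumed)
        self._clock = clock
        self._pending = []
        self._oldest = None

    def add(self, message):
        """Queue a message; flush once the oldest one has waited long enough."""
        if not self._pending:
            self._oldest = self._clock()
        self._pending.append(message)
        if self._clock() - self._oldest >= self._max_delay_s:
            self.flush()

    def flush(self):
        """Send everything queued in a single burst (one promotion, one tail)."""
        if self._pending:
            self._send_burst(list(self._pending))
            self._pending.clear()
            self._oldest = None


batcher = BurstBatcher(send_burst=lambda batch: print("sending", len(batch), "messages"))
batcher.add({"event": "track_played"})
batcher.flush()  # for example, when the app is about to use the network anyway
```

Calling `flush()` whenever the radio is already at full power for some other transfer lets the queued data ride along without paying for a separate promotion and tail.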


Because of the long tail time that holds the device in a high-power mode, transmitting data as a series of small data bursts takes much longer and requires much more battery power than it does to send a single bigger burst.


Making the state machine visible

To make the workings of the state machine transparent and thus help developers see what was going on in their apps, researchers created an app profiling technology capable of pinpointing inefficient resource usage in apps. Essentially, the technology looks at all cross-layer interactions occurring from the application layer down to the radio resource state and identifies the events happening at each layer.

The profiler works like this: a collector runs directly on the device along with the app being analyzed, collecting traces and user actions. Packets from all apps are captured and then downloaded onto the PC running an analyzer, which separates out packets of different apps. An app can thus be analyzed separately, or in combination with other apps.

The analyzer performs a series of analyses at each layer of the protocol stack (RRC state machine, TCP, and HTTP) to infer the state, distinguish data packets from control and other special packets, and map packets to HTTP requests and responses.

From these analyses, researchers are able to tell what state the machine was in at any point, see what events are occurring concurrently, differentiate large (and resource-efficient) bursts from small ones, and most importantly, associate each data transfer with the event—user input, signaling exchanges, server or network delay, TCP handshakes, packet loss, periodic updates—that precipitated it. Discovering triggering factors is crucial for understanding and fixing the root cause of inefficient resource utilization.
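The state-inference step can be approximated from packet timestamps alone. The sketch below is a rough illustration, not the analyzer described in the MobiSys paper: it assumes every packet requires the full-power state, ignores promotion delays, and uses a 5-second full-to-half timer and a 12-second half-to-idle timer, values chosen to match the 12-second and 17-second figures quoted in this article. The function name `infer_rrc_states` is hypothetical.

```python
def infer_rrc_states(packet_times_s, dch_tail_s=5.0, fach_tail_s=12.0):
    """Return (start, end, state) segments inferred from packet timestamps.

    Simplifications (assumptions): every packet needs DCH, promotions are
    instantaneous, and both inactivity timers are fixed.
    """
    segments = []

    def append(start, end, state):
        if end <= start:
            return  # skip empty segments
        if segments and segments[-1][2] == state and segments[-1][1] == start:
            segments[-1] = (segments[-1][0], end, state)  # merge with previous
        else:
            segments.append((start, end, state))

    times = sorted(packet_times_s)
    for i, t in enumerate(times):
        nxt = times[i + 1] if i + 1 < len(times) else float("inf")
        dch_end = t + dch_tail_s            # full power until the first timer fires
        fach_end = dch_end + fach_tail_s    # then half power until the second fires
        append(t, min(nxt, dch_end), "DCH")
        if nxt > dch_end:
            append(dch_end, min(nxt, fach_end), "FACH")
        if fach_end < nxt < float("inf"):
            append(fach_end, nxt, "IDLE")
    return segments


for segment in infer_rrc_states([0, 1, 30]):
    print(segment)
```

Aligning segments like these with the packets that triggered them is what makes a wasteful pattern, such as a tail that is much longer than the transfer it follows, stand out.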

(Specific information about how the technology works is detailed in a paper to be presented at MobiSys this June.)


The analyzed information is presented as visualizations showing the sequence of states and state transitions, aligned with the triggering events.


Visualizations of the analysis clearly show transitions and their frequency and how long the device is in full-power mode.


Being able to "see" state transitions opens a window directly onto the state machine; aligning transitions with bursts reveals the source of the particular problem. Presented in this form, the analyses make it easy for developers and others to understand exactly what is going on, making it possible to quantify energy and radio network efficiency, pinpoint inefficiencies, and more importantly, identify solutions to fix inefficiencies.

The profiling technology also enables developers to quickly try out different what-if data transmission and packet timing scenarios to get the best performance, without having to go through the costly development cycles of creating new versions of their apps.

Another possible use is contributing metrics for a rating system that would evaluate apps based on energy efficiency, and enable users to consider energy efficiency when choosing apps.


Case study: PANDORA® internet radio

One of the first Android apps to be analyzed using the new energy profiling technology was the popular PANDORA® internet radio, a source of a significant amount of traffic on cellular networks.

In one instance, researchers ran the app profiler with PANDORA® for 12 minutes. The resulting visualization showed a series of short bursts once every 62.5 seconds, each preceded by an energy-expensive promotion and followed by a wasteful tail time.



Visualizing PANDORA® transmissions reveals a constant cycle of connections and reconnections for small data transfers.


While the music itself was sent simply and efficiently as a single file, the periodic audience measurements, each only about 2 KB, were being transmitted at regular 62.5-second intervals. The constant cycle of ramping up to full power (2 seconds to ramp up, 1 second to download 2 KB) and back to idle (17 seconds for the two tail times, the first down from full-power mode and the second down from half-power mode) was extremely wasteful. Of the total device battery energy consumed, 46% was expended on these periodic measurements, which accounted for less than 0.2% of the data transmitted.
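The timing figures above make the waste easy to quantify. The short calculation below uses only the durations quoted in this article; it does not model power draw, so it says nothing about the 46% energy figure itself, and the variable names are illustrative.

```python
CYCLE_S = 62.5       # interval between audience-measurement bursts
PROMOTION_S = 2.0    # ramp-up from idle to full power
TRANSFER_S = 1.0     # downloading about 2 KB
TAIL_S = 17.0        # the two tail times combined

# Seconds per cycle during which the radio is not idle.
active_s = PROMOTION_S + TRANSFER_S + TAIL_S

# Share of every cycle spent away from idle because of one small transfer.
duty_cycle = active_s / CYCLE_S

# Share of that active time actually spent moving data.
useful_fraction = TRANSFER_S / active_s

# Complete cycles that fit into the 12-minute profiling run.
bursts_in_run = int(12 * 60 // CYCLE_S)

print(f"radio busy {active_s:.0f} s out of every {CYCLE_S} s ({duty_cycle:.0%})")
print(f"only {useful_fraction:.0%} of that time transfers data")
print(f"{bursts_in_run} such cycles in a 12-minute run")
```

In other words, each 2 KB measurement keeps the radio out of idle for about 20 seconds, of which a single second is spent transferring data.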

Once the initial problem was diagnosed—a scatter-burst transmission of small packets—the solution was obvious: bundle the packets into a single transmission. Doing so would result in an easy 40% energy savings.


Researchers also analyzed popular news apps and found a wide variability in how data, and how much, was transmitted. They were particularly interested in two metrics: a high promotion ratio (time spent in transitions vs entire transaction time) and a high tail ratio (percentage of time spent in the tail time vs total time in full- or half-power mode), which was wasteful both of battery energy and network resources.
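Both metrics are simple ratios of measured durations. The helper functions below are a sketch based on the definitions given in the paragraph above; the function names and the sample numbers are illustrative and not part of the profiling tool.

```python
def _ratio(part_s, whole_s):
    """Return part/whole after basic sanity checks on the durations."""
    if whole_s <= 0:
        raise ValueError("total time must be positive")
    if not 0 <= part_s <= whole_s:
        raise ValueError("part must lie between 0 and the total")
    return part_s / whole_s


def promotion_ratio(promotion_s, transaction_s):
    """Time spent in state transitions divided by the whole transaction time."""
    return _ratio(promotion_s, transaction_s)


def tail_ratio(tail_s, high_power_s):
    """Tail time divided by total time spent in full- or half-power mode."""
    return _ratio(tail_s, high_power_s)


# Illustrative durations in seconds, not measurements from the study.
print(f"promotion ratio: {promotion_ratio(4.0, 40.0):.0%}")
print(f"tail ratio:      {tail_ratio(57.0, 100.0):.0%}")
```

A low value for both ratios indicates that most of the time the radio spends powered up is spent actually transferring data.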


Case study: The costs of a high tail ratio

Researchers looked particularly at a news app with a high tail ratio. For this app, researchers found that a little more than half the time (57%) spent in full or half power was actually spent in the tail. Close to 40% of battery energy was being used to hold open an empty channel.

This visualization shows the inefficiency of small data transfers, where more time is spent in the tail time than in transferring the object.


The problem was found in the HTTP-level analysis, which showed small objects being transferred whenever the user scrolled, requiring constant promotions to full power. As one example, transmitting the small (5 KB or so) headline thumbnails was alone eating up 15% to 18% of total power consumed, a costly inefficiency easily avoided by prefetching all thumbnails in one batch.

Researchers looked at other apps and found other inefficiencies: apps that repeatedly transferred the same image at every refresh, apps that periodically connected to the same IP address, and apps that smoothly streamed video and audio at a low bit rate using only a portion of the available bandwidth.


Prefetching a little or a lot?

Some news apps had very good tail and promotion ratios, often due to aggressive prefetching. (In prefetching, content is downloaded and stored locally where it's immediately available without a network connection, or the wait for one.) News apps prefetch by differing amounts; some download the entire edition, while others download a page at a time.
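The effect of prefetching on the number of network connections can be sketched with a small cache. This is an illustrative outline rather than how any of the analyzed apps work: the class name `PrefetchCache` and the `fetch_batch` callback (assumed to take a list of keys and return a dictionary of contents in one network burst) are hypothetical.

```python
class PrefetchCache:
    """Store content locally so later reads need no network connection."""

    def __init__(self, fetch_batch):
        self._fetch_batch = fetch_batch  # hypothetical: list of keys -> dict
        self._store = {}
        self.network_bursts = 0          # how often the radio had to be used

    def prefetch(self, keys):
        """Download every missing item in a single burst."""
        missing = [key for key in keys if key not in self._store]
        if missing:
            self._store.update(self._fetch_batch(missing))
            self.network_bursts += 1

    def get(self, key):
        """Return an item, going to the network only if it was not prefetched."""
        if key not in self._store:
            self.prefetch([key])
        return self._store[key]


def fake_fetch(keys):
    """Stand-in for a real download, used only for this example."""
    return {key: f"content of {key}" for key in keys}


cache = PrefetchCache(fake_fetch)
cache.prefetch(["front-page", "sports", "weather"])
cache.get("sports")                  # served locally
cache.get("opinion")                 # not prefetched, so a second burst is needed
print(cache.network_bursts)
```

How many keys to pass to `prefetch` is exactly the "a little or a lot" decision: a longer list means fewer bursts later but more data downloaded up front.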

A lot of prefetching improves the user experience since pages come up quickly. Apps that do little prefetching often need to connect frequently to download new items; the associated delays slow response times, and the constant promotions use up battery power.

But, again, there are tradeoffs that prevent making blanket recommendations for all cases. Sometimes prefetching makes sense, other times it does not. While prefetching makes for a better user experience, the initial delay is longer (close to two minutes for one app), and it potentially wastes bandwidth and battery power if content is downloaded and not consumed, such as when users want only to scan headlines. And prefetching costs more, since mobility customers pay more for more bandwidth.


The next steps

AT&T Research is starting to share its analysis with developers of popular apps, including PANDORA® and Facebook®; so far, the reception has been enthusiastic as developers have been quick to grasp the concept and understand how the analysis can help them optimize their apps.

AT&T's analysis of the Pandora application gave us a much better view of how Pandora interacts with low-level cellular network resources. Now that we better understand these interactions, we can optimize our application to make more efficient use of these resources. In fact, we'd like to incorporate AT&T's profiling tool as part of our normal ongoing testing.

-Tom Conrad, CTO of PANDORA

Perhaps the most important information delivered by the app profiling technology is how easy it is to fix the inefficiencies. Many solutions—bundling packets, HTTP pipelining, persistent connections, timing the delivery of packets—are already part of the standard techniques available to developers and require no new algorithms to implement, no new protocols, no new data transmission techniques. The difficulty until now has been identifying the specific problem. The app profiler now fills that role.

The other good news is that by making apps more energy efficient and more cellular-network friendly—that is, by specifically designing apps to minimize the time spent in full-power mode and thinking carefully about the tradeoffs often required—not only do apps use less battery power, they respond faster to user input. They become better performing apps. This is good for users, who get a better experience and longer battery life. It’s good for network providers whose resources are utilized more efficiently. And it’s good for developers because their customers are happier.

It’s a win-win-win situation, with no tradeoffs.

Update: AT&T ARO (Application Resource Optimizer) is now available for download.


What is an app?

Apps are essentially software programs adapted for small mobile devices. Some apps you buy, some are free, and others are automatically downloaded.

Among the most popular apps are those for streaming music or video or for viewing news sites. They include PANDORA® and news apps for large media sites (such as the BBC, Fox News, the New York Times, and others).

Apps connect to the network to download a content page along with instructions on what other objects—files, graphics, videos—need to be downloaded and from where. The objects are then downloaded over transport connections (usually TCP).


What is an energy-efficient app?

An energy-efficient app is one adapted for the characteristics of the cellular network, specifically one designed to transmit data in a way that both limits the amount of time connected to the network (to conserve limited battery and cellular resources) and minimizes the number of connections required (thus reducing the latencies entailed by setting up network connections).




What is fast dormancy?

The fast dormancy feature allows an app to send a control message to unilaterally and immediately demote the device to a lower power mode, without waiting for inactivity timers to expire. This conserves battery power and makes network resources available sooner to other network users.

But fast dormancy is difficult to use. Predicting the end of a connection is easier on certain types of apps (an online banking transaction, a file transfer) than others (random web browsing).

Used too aggressively, fast dormancy can actually worsen matters, since dropping to a lower power mode may affect other concurrently running apps that would almost immediately require the device to be promoted.

To avoid this, AT&T researchers (again in collaboration with University of Michigan colleagues) are proposing the Tail Optimization Protocol (TOP), which implements fast dormancy using a simple interface that coordinates among currently running apps before demoting the device.
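The coordination idea can be sketched as follows. This is not the TOP interface itself, only a minimal illustration of checking with every running app before demoting: the class name `DormancyCoordinator`, the `request_fast_dormancy` hook and the 12-second threshold are assumptions made for the example.

```python
class DormancyCoordinator:
    """Demote the radio early only when every running app expects to be idle."""

    def __init__(self, request_fast_dormancy, min_idle_s=12.0):
        self._request = request_fast_dormancy  # hypothetical platform hook
        self._min_idle_s = min_idle_s          # assumed threshold
        self._predicted_idle_s = {}

    def register(self, app_id):
        """A newly started app is treated as active until it says otherwise."""
        self._predicted_idle_s[app_id] = 0.0

    def report(self, app_id, predicted_idle_s):
        """Record how long an app expects to stay silent.

        Returns True if fast dormancy was requested as a result.
        """
        self._predicted_idle_s[app_id] = predicted_idle_s
        if min(self._predicted_idle_s.values()) >= self._min_idle_s:
            self._request()
            return True
        return False


coordinator = DormancyCoordinator(lambda: print("requesting fast dormancy"))
coordinator.register("mail")
coordinator.register("browser")
coordinator.report("mail", 60.0)      # browser may still need the radio
coordinator.report("browser", 30.0)   # now every app expects a long pause
```

Taking the minimum over all apps is what prevents one app's request from cutting off another app that is about to transmit.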

For more information about TOP, see the paper TOP: Tail Optimization Protocol for Cellular Radio Resource Allocation


Best practices for app developers

Performance problems in apps are due to a variety of causes. While identifying the specific causes in a single app requires completely analyzing the app (for example, using AT&T’s energy profiling technology), it’s possible to make some broad recommendations to both improve performance and reduce energy consumption.

The single most important measure: transmit as much data as possible in a single burst and then end the connection.

Files for example should be downloaded all at once, at full capacity. This both improves user response times (since there are no delays to set up multiple network connections) and conserves battery power.

The exception is for large video files, which should be downloaded in segments at regular intervals (every 2 to 5 minutes). Then if the user stops viewing the video, the transmission can be ended without any further downloads or use of bandwidth.

Other, more specific practices—uploading only those objects that have changed, synchronizing the app to poll at the same time as other apps—will also help depending on what the exact problem is.
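Synchronizing polls can be as simple as snapping each app's next poll to a shared time grid, so that apps with the same interval wake the radio together. The helper below is a sketch of that idea; the function name `next_aligned_poll` and the 300-second interval are illustrative assumptions.

```python
import math


def next_aligned_poll(now_s, interval_s):
    """Return the next poll time on a grid shared by all apps using interval_s.

    Times are seconds since a common reference (for example, the Unix epoch).
    """
    return (math.floor(now_s / interval_s) + 1) * interval_s


# Two apps that start at different moments still poll at the same instant,
# so a single promotion and a single tail serve both.
mail_poll = next_aligned_poll(now_s=130, interval_s=300)
news_poll = next_aligned_poll(now_s=250, interval_s=300)
print(mail_poll, news_poll)
```

The cost is that an individual poll may be delayed by up to one interval, which is acceptable only for updates that are not time-critical.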

To help developers, AT&T is setting up a best practices website that will give more specific guidance on how to increase the energy efficiency (and thus the performance) of apps. This website will be available later this spring at


About the researchers

Alexandre Gerber is a principal member of the technical staff at AT&T Labs – Research. His research interests include Internet routing, network data mining, and IP traffic measurements across environments ranging from enterprise virtual private networks (VPNs) to consumer wireless and wireline broadband Internet networks. Gerber has an MS in telecommunications engineering from Telecom Bretagne. He is a member of the ACM.

Subhabrata Sen is a principal member of the technical staff at AT&T Labs – Research. His research interests include IP network management, application and network performance, network data mining, network measurements, configuration management, traffic analysis, and wireless networks. Sen has a PhD in computer science from the University of Massachusetts, Amherst. He is a member of the IEEE and the ACM.

Oliver Spatscheck is a lead member of the technical staff at AT&T Labs – Research. His research interests include content distribution, network measurement, and cross layer network optimizations. Spatscheck has a PhD in computer science from the University of Arizona, Tucson. He is a member of ACM, IEEE, and USENIX.

Feng Qian is a PhD student in the EECS department at the University of Michigan, advised by Professor Z. Morley Mao. He obtained his master's degree in Computer Science and Engineering from the University of Michigan in 2009, and his bachelor's degree in Computer Science (ACM Honored Class) from Shanghai Jiao Tong University, China in 2007. His research focuses on network traffic analysis and mobile networks. He has published several papers in related conference proceedings including the Internet Measurement Conference (IMC), the IEEE International Conference on Network Protocols (ICNP), and the International Conference on Mobile Systems, Applications, and Services (MobiSys).

Zhaoguang Wang is a PhD student in Computer Science and Engineering at the University of Michigan. He obtained his bachelor's degree in Software Engineering from Shanghai Jiao Tong University, China in 2009. He works on mobile networks and smartphone-related topics under the supervision of Prof. Z. Morley Mao. He has coauthored several publications in the proceedings of the Internet Measurement Conference (IMC), the IEEE International Conference on Network Protocols (ICNP), and the International Conference on Mobile Systems, Applications, and Services (MobiSys).

Z. Morley Mao is an associate professor in the Department of Electrical Engineering and Computer Science at the University of Michigan. She received her BS, MS, and PhD degrees, all from the University of California at Berkeley. She is a recipient of the NSF CAREER Award, a Sloan Fellowship, and the IBM Faculty Partnership Award. She has been named the Morris Wellman Faculty Development Professor. Her research interests encompass network systems, routing protocols, mobile and distributed systems, and network security.