Charles Barry[1]
Independent Cyber Security Consultant and Senior Fellow, Center for Technology and
National Security Policy, National Defense University, Washington
Executive Summary [2]
This study analyzes the resilience and vulnerabilities of large organizations’ connections across the Internet. It is informed by an embedded inquiry into the dependencies of public organizations and private enterprises whose operations rely heavily on the Internet. Snapshots of both the logical topologies and physical structures of the Internet are readily available. However, logical topologies of information flows are always dynamic, and visual graphics depict only a particular instant of real connectivity. The physical backbone of links and nodes over which data is transmitted is less dynamic, yet it too is always changing as new infrastructure is added and business relationships change among providers. These logical and physical maps are interdependent.
Autonomous Systems (ASes) are the primary building blocks of Internet connectivity, and they comprise the network level with the most useful information for a detailed analysis. Internet traffic paths are determined by the routing logic (policies) employed by the various AS networks involved in transiting any particular data, and by the mesh of physical links available between a given sender and its recipient. The conclusions and recommendations listed in the following Overview section of this paper are based on this AS level analysis.
Commercial services and academic research offer broad insights into the resiliency of select organizations’ Internet connectivity. Some analyses show that the connectivity of prominent organizations is not as resilient as that of the best network industry entities. However, more analysis is needed in order to define any useful vulnerability information and offer specific remedies. Rigorous research can point out the need for greater redundancy in physical connections, the advisability of customer-tailored routing policies, contractual arrangements with multiple service providers, or even international initiatives to strengthen areas of the physical infrastructure at high risk of single-point failures.
From the results of such analysis two further lines of research should unfold. The first should be based on modeling and simulation to better understand the way the logical topology of the Internet responds to changes in its physical structure. This could be done using existing models and simulations as well as designing new ones. A second line of research should engage external academic and commercial enterprises already conducting analysis of transit traffic metadata to understand what other information should be considered to produce a high fidelity picture of organizational dependencies, and thus vulnerability to network disruptions. Both these initiatives would yield greater understanding of potential remedies.
Overview
Almost all organizations depend on the Internet for their most sensitive communications and information requirements as well as for the conduct of day-to-day business operations. Connections to the Internet are thus mission critical. Reliability demands diversity of access and redundant, secure upstream topology across the entire Internet infrastructure.
The logical and physical ‘maps’ of the Internet are interdependent and both are highly dynamic. Routing policies employ algorithms based on conditions such as volume, shortest path to recipient, transit pricing, network availability and peering relationships. Routing policies are routinely revised by providers without notice.
Physical routing choices can change abruptly based on the realities of equipment failure, latency and speed across undersea or overland cables, and congestion at Internet exchange points. The logical unavailability of particular links and nodes is a key factor in physical routing. The focus herein is on the AS level of the Internet. The ASes of interest are the Internet’s thousands of transit service providers (ISPs).
These are the on ramps as well as the nodes of the Internet, and they are where routing policies are established that govern connections within and between the many tiered networks that make up the Internet. The diversity and redundancy of these connections establish a picture of either redundant reliability or multiple vulnerabilities to potential disruption. The physical paths of the Internet also are most manifest at the AS level. The basic building blocks are the cables and Internet exchange points (IXPs) where ISPs exchange traffic between networks. IXPs are often where major overland and undersea cables connect. Outages for any reason across these nodes and links represent the physical side of potential vulnerabilities of data during transit.
Internet traffic is monitored and mapped in detail by research centers and Internet data tracking services. Some enterprises specialize in precise situational awareness of route congestion, equipment outages, and disruptive events like natural disasters and malware attacks. Tracking information can be tailored for commercial use, especially in the financial and energy sectors, or for government departments like ministries of foreign affairs and defense.
Maps of logical traffic flow information are depicted as either streaming graphics or snapshots, usually in milliseconds. They note traffic volume as normal or trending, and depict disruptive events that impede traffic flows. For example, Internet traffic monitoring service Renesys (since acquired by Dyn Corp) posted several time-sequence graphics depicting Internet traffic flow interruptions in Thailand during recent protests there [3].
Much of the Internet’s physical infrastructure is reliable and redundant, though not always secure. An initial examination suggests that path diversity in many areas of interest might be improved (appendices 2 and 3 address vulnerabilities in undersea cables and Internet Service Providers — ISPs). However, many other factors come into play in choosing paths, including politics, business agreements, transit pricing, and router configuration.
The research reported here examines several combinations of physical infrastructures and logical topologies. A useful metric looks at the Tier level (1, 2 or 3) of immediate access providers and the number of different providers available to an organization, to produce a consolidated picture of transit diversity and resilience in the event of service disruptions. Recommendations include company policy changes and negotiating better service agreements to increase the resilience of Internet-dependent networks worldwide.
Logical topologies provide indications of potential points of vulnerability due to common factors such as high traffic volume. Logical topologies can be displayed in many ways. Nodes can be shown by content size, or their number of connections. Link thickness can reflect the normal amount of traffic carried. Maps can show the routes that packets take as a snapshot in time. These visualizations can be very interesting, even dramatic, but none that were examined offered useful insight into sustained network resilience.
The most common causes of physical disruptions are still accidental damage, equipment failure or criminal activity. Politically motivated attacks on Internet infrastructures are far less frequent, but have been increasing. Up-to-date measures of transit dependency, access redundancy and system vulnerability can help any Internet-dependent organization assess where to devote resources to improve reliability.
This research does not address all risks inherent in dependency on the Internet. The AS level does not include vulnerabilities that can be present at the sub-network level within each AS. The employment of routers within sub-networks (sometimes thousands of routers) may represent other single points of failure that are not visible at the AS level itself. Future research may discover other logical or physical vulnerabilities, and more work needs to be done to understand optimum connection resiliency. For very large organizations with global business interests, it may be necessary to disaggregate and prioritize the dependencies of each of their sub-entities in order to concentrate on the most critical vulnerabilities in more detail, e.g., by geographic sub-region and prioritized sites within these areas.
Conclusions, Recommendations and Further Analysis
Robust access resiliency helps ensure continuous connectivity and guards against disruptions. Snapshots of networks at a single point in time or a single point of presence are not enough. Organizations must have continuous awareness of the health of their immediate connections to the Internet as well as the systemic reliability of relevant upstream ASes. Such situational awareness has to extend beyond an organization’s most common transit paths to include an understanding of the overall physical Internet infrastructure and logical topologies involved in routing traffic across networks from origin to destination, including an appreciation of potential vulnerabilities along logical paths, which may stem from many sources. Determining adequate connectivity requires standards, which are always evolving and may not be fully defined. Resiliency may also require new initiatives such as public-private partnerships, whole-of-government engagement, and bilateral diplomacy.
Recommendations
• Identify ASes critical to an organization’s strategic and operational connectivity.
• Develop a high fidelity methodology to monitor Internet dependencies continuously.
• Work with governments and private sector providers in the United States and overseas to enhance at least the critical network connectivity and security.
• Identify international initiatives to encourage host nations to address infrastructure weaknesses and vulnerabilities.
• Examine Internet service contracts and routing policies to reduce vulnerability and enhance ability to re-route network traffic rapidly when/where disruptions occur.
• With regard to cloud service providers, ensure back up data and systems are not stored in the same cloud.
• Be aware of global crises that might require a company’s traffic to be re-routed through countries where it is not subject to disruption.
Areas for Further Analysis
• Examine methodologies for measuring the resilience of Internet connectivity to determine best of breed metrics for continuous monitoring.
• Analyze the connectivity of major sub-agencies, activities, and critical sites and compare their resilience to an acceptable requirements benchmark.
• Develop a methodology to determine the ASes, countries and regions where a global organization has the greatest vulnerability/least resiliency.
• Combine these into a more granular analysis of resilience by region and function.
• Develop a continuous resiliency monitoring system, e.g., scorecards, for each command, agency, field activity, critical site worldwide.
• Identify globally the critical Internet Exchange Points (IXPs), undersea and overland cables, cable landing points, large data centers and satellite downlinks that are vulnerable to physical disruption and establish risk profiles for each.
General
The Internet can be thought of as the global Information and Communications Technology (ICT) infrastructure — the interconnected, networked mesh of routers, switches and other purpose-designed nodes that connect computers and other information systems. The two key components of the ICT infrastructure are its global physical architecture and its logical topology.
The physical infrastructure is comprised of tangible structures and hardware located on national territory or in the undersea or space global commons. The majority of this physical infrastructure is operated by private entities. The logical topology of the Internet is determined by the routing policies (software protocols or rule sets) established by each of the administrators of the thousands of separate domains (networks) called Autonomous Systems (ASes) [4], in order to control the flow rate and path of information across and among their respective ASes.
This study concentrates on logical Internet mapping at the Autonomous Systems (AS) level based on work done by companies like Dyn (formerly Renesys) and the Network Science Research Group at the University of Louisiana — Lafayette. AS layer modeling seeks to generate a topological picture that is more detailed than the network layer, yet more aggregated than would be presented by a sub-network or router level analysis. All modeling levels have value, and each offers complementary perspectives on route topology. Ultimately, additional models, other than AS-level descriptions, would be necessary to understand Internet dependency and the diversity of access for various sub-organizations. For example, a full determination of network resilience requires analysis of downstream sub-networks and routers as well as upstream Internet exchange point (IXP) connections. This study is but a start.
The Internet is not a fixed ‘network of networks.’ The number and size of AS-defined networks is constantly changing. More than 45,800 ASes (as of 12 December 2013 [5]) currently make up the Internet Protocol Version 4 routing system. These operate through 300 or more large Internet Exchange Points (IXPs) [6] and smaller connection nodes, and over hundreds of thousands of miles of undersea and terrestrial fiber optic cables, plus space and microwave links.
ASes are often Internet Service Providers (ISPs) [7] providing data transit services between one host and another. ISPs are further described as Tier 1, 2 or 3 service providers. Tier 1 ISPs are those that can reach every part of the Internet by themselves or through peering, without purchasing transit or paying peer settlement fees to other ISPs. In the United States the top three Tier 1 providers are Level 3 Communications, Verizon Business and AT&T [8]. Tier 1 providers in other regions include NTT Communications (Japan), China Telecom, Tata (India), Deutsche Telekom AG (Germany), TeliaSonera (Europe), and Seabone (Italy) [9]. These peer-to-peer ASes typically have the greatest capacity, the most current technology and the strongest resiliency. They are often referred to informally as ‘the Internet backbone’ [10]. Tier 2 providers both peer with some other ISPs and buy transit rights across parts of the Internet from other, usually Tier 1, ISPs. Tier 3 ISPs only provide transit to the Internet via agreements they purchase from other ISPs. This third tier is the typical access link for individuals and small businesses. Large organizations can and should link directly to Tier 1 or Tier 2 ASes (ISPs) to increase service quality as well as reach and reliability.
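The three tier definitions above can be read as a simple decision rule. The following is a minimal sketch under that reading; the record fields and the example ISP names are hypothetical and are not drawn from any provider database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isp:
    """Hypothetical record of how an ISP reaches the rest of the Internet."""
    name: str
    peers_with_others: bool  # exchanges traffic under peering agreements
    buys_transit: bool       # pays other ISPs for transit or settlement


def classify_tier(isp: Isp) -> int:
    """Return 1, 2 or 3 following the tier definitions in the text."""
    if isp.peers_with_others and not isp.buys_transit:
        return 1  # reaches everything by itself or through peering
    if isp.peers_with_others and isp.buys_transit:
        return 2  # peers with some ISPs and buys transit for the rest
    # Assumption: anything that does not peer is treated as Tier 3.
    return 3


examples = [
    Isp("backbone-example", peers_with_others=True, buys_transit=False),
    Isp("regional-example", peers_with_others=True, buys_transit=True),
    Isp("access-example", peers_with_others=False, buys_transit=True),
]
tiers = {isp.name: classify_tier(isp) for isp in examples}
print(tiers)
```

In practice the boundaries are less sharp than this rule suggests, since peering and transit arrangements differ by region and change over time.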
The ever-changing AS composition of the Internet makes it hard to present an accurate, current map of its structure in a traditional sense. Through the use of traceroute and tracepath software, as well as other metrics, a number of commercial companies can provide minute-by-minute mapping, especially where superimposed on a familiar geographic reference such as a given country’s borders. Providers within a country or region can be identified, with the largest service providers being relatively fixed in terms of their Tier 1 or Tier 2 status. Each ISP’s market share, or ‘reach,’ is determined by the percent of Internet Protocol (IP) addresses it services via downstream providers.
The Physical Infrastructure of the Internet
The physical infrastructure of the Internet is made up of the AS domain networks (Tier 1, Tier 2 and Tier 3, as described above). In large IXPs as many as 50 or 60 ASes maintain links to each other in the same building. There they can connect to multiple ASes or to undersea cable providers to pass traffic. IXPs like 60 Hudson Street and 111 Eighth Avenue in New York or 530 6th Street in Los Angeles are some of the biggest, but these types of nodes exist globally, with more than 250 being accessible for analysis via interactive maps. If transit traffic were to be interrupted at any one of these sites it would require alternate routing. In some places, e.g., the United States, alternate routing may be readily established, but it could be much more difficult elsewhere, e.g., in Central Asia. In any case, some IXPs would be attractive targets should someone want to disrupt Internet traffic [11], and many are not well protected against physical attacks.
Approximately 95% of Internet traffic transits via fiber optic links, either terrestrial or undersea cable. The remaining 5% (or less) is transmitted via satellite, microwave links, free space optics and other niche means. Fiber optic cables continue to be far more efficient than satellites in terms of operating cost, traffic capacity, signal quality, and reliability [12]. However, satellites provide valuable service alternatives in contingency operations and for high priority transmissions in remote areas. Since 2011 a new generation of satellites has been improving Internet speeds and reliability at lower cost than previous satellite systems [13]. This trend is expected to gain momentum.
A third essential component of the physical infrastructure, although an indirect one, is the electrical power grid, since both nodes and links require electricity [14]. Power grids, though not dedicated exclusively to the Internet, must be considered when assessing vulnerabilities. ICT and power infrastructures go hand-in-hand and are two of the most economically critical infrastructures in any country.
The Internet’s Logical Topology
The Internet’s logical topology is determined by its routing architecture, which is based on the policies, protocols and rule sets put in place and maintained by ASes. The most prominent Internet protocol suite is TCP/IP (Transmission Control Protocol/Internet Protocol). TCP/IP determines the logic of how data should be formatted, addressed, transmitted, routed and received. It is the fundamental protocol suite for defining the end-to-end connectivity logic across the Internet’s system of networks.
Another core protocol is the Border Gateway Protocol (BGP), which ASes use in establishing their routing policies with each other’s domains. TCP/IP, BGP and many other protocols define the logical Internet. ASes also use Interior Gateway Protocols (IGPs) for internal routing among their IP sub-networks. Importantly, ASes have an understandable preference for routing traffic via their own sub-networks, sister networks or peer networks with which they have business routing agreements. This reality can mean taking longer routes than expected in some traffic situations. For simplicity of analysis, we will regard ASes as single entities within the routing logic of the Internet as a whole.
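The effect of business preference on path length can be illustrated with a small sketch. It assumes a simplified selection rule in which the business relationship is compared first and the AS path length only breaks ties; the ranking table and the routes are illustrative, and real BGP policies involve many more attributes. The AS numbers are taken from the range reserved for documentation.

```python
from typing import NamedTuple, Sequence, Tuple

# Assumed preference order, following the text: own sub-networks first,
# then sister networks, then peers, then any other network.
RELATIONSHIP_RANK = {"own": 0, "sister": 1, "peer": 2, "other": 3}


class Route(NamedTuple):
    relationship: str         # relationship to the next-hop network
    as_path: Tuple[int, ...]  # AS numbers traversed to the destination


def best_route(candidates: Sequence[Route]) -> Route:
    """Pick the route with the preferred relationship, then the shortest path."""
    return min(
        candidates,
        key=lambda r: (RELATIONSHIP_RANK[r.relationship], len(r.as_path)),
    )


candidates = [
    Route("other", (64500, 64510)),               # two AS hops
    Route("peer", (64501, 64502, 64503, 64510)),  # four AS hops
]
chosen = best_route(candidates)
print(chosen.relationship, len(chosen.as_path))
```

Here the four-hop route through a peer is selected over the two-hop route through an unrelated network, which is the kind of longer-than-expected path the paragraph above describes.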
Among the most significant applications of the logical Internet are the operating systems of the physical links and nodes, the software utilized by the various tiers of ISPs, and the Domain Name System (DNS) software. Like the protocols they utilize, these systems affect how traffic flows on the Internet.
The army of network administrators who set distinct routing policies for their individual ASes also have considerable influence on the logical traffic flow of their customers. There are now some 72,000 allocated Autonomous System Numbers (ASNs) [15], each with a distinct set of external routing policies. These policies determine normal routing flows as well as how traffic flows will be re-directed in the event of disruptions or in emergencies. For example, ISPs can prioritize traffic of their largest clients first in terms of bandwidth availability, transit speeds and repairing outages.
While physical maps of networks are usually available via open sources to inform users of potential risks of degraded service, administrators of private sector networks typically consider their system’s actual (versus initial) routing policies, and thus much of the granularity of the Internet’s logical topology, to be proprietary.
Internet Vulnerabilities
This is not a paper on cyber security, but some background on potential vulnerabilities and types of physical and logical disruptions may be useful. Physical and logical disruptions can result from natural or manmade causes, accidental or intentional, as well as from electro-mechanical failures (including system overload). Physical disruptions can cause logical disruption and vice versa: physical damage to a network component can alter the logical flow which may affect Internet latency and bandwidth (throughput). In the run-up to military actions, cutting, jamming or otherwise disrupting (at least temporarily) key communications and information nodes can be expected. Aside from the infrastructure destruction in New York in conjunction with 9/11, terrorist attacks thus far have not caused significant disruption of Internet service (as opposed to damaging targeted facilities like the Aramco refineries in Saudi Arabia); but this has to be considered a future possibility.
Disruptions to the logical topology through malware may disrupt or degrade the physical Internet, as in a Distributed Denial of Service (DDoS) event that disrupts server capability. Points of vulnerability for the logical Internet include: (1) manipulation of TCP/IP software, (2) attacks directed at ISPs and (3) attacks on the DNS itself. More information is in Appendix 3.
A key point is that a truly global strategy for reducing vulnerabilities can’t primarily be technical; it must be an integrated public-private, whole of organization and government, trans-national approach mounted in parallel across people, processes, organizations and technology.
Dependency Analysis
Measuring Resilience to Internet Disruptions
Measuring an organization’s resilience to Internet disruption requires an understanding of the alternative means of information transfer that could be used should a particular sub-set of the Internet not be available. This is important since most of any large organization’s information, including proprietary data exchanges, flows over public networks or, more accurately, is encapsulated, encrypted and tunneled [16] across the Internet infrastructure. An analysis of reliance on the Internet needs to include: (1) the criticality of a subnet, (2) the host country profile, and (3) the diversity of available connections.
Criticality: Which parts and regions of the Internet are most critical to the organization, and where does its reliance on some part of that infrastructure and logical topology constitute an unacceptable risk? It will be important to identify on which ASes the organization relies, how many of them are owned by a single provider, where the key nodes and links are, and what routing policies they have adopted. Priorities must be assigned among, for example, ASes serving command headquarters and those supporting non-front line units in remote locations. Analysts must also understand the redundancies that are available. Ultimately, similar metrics will be needed for interagency partners and allies deemed critical to national security.
Host country profiles: Organizations also need to know the Internet infrastructure profile of the host countries in which they operate or plan to operate in the future. The number of ASNs assigned by international agreement to a country, usually to ISPs, is one recognized measure of Internet infrastructure robustness. For example, Turkey reportedly has 290 ASNs while Azerbaijan has only 31. Other countries of note include Afghanistan with only 20 and Turkmenistan with only 2. Syria is also thought to have only two ASNs while Yemen has but one. These numbers indicate countries at risk of being disrupted significantly by natural or manmade disasters. Other countries that have little or no host nation-based Internet access include Bangladesh, Turkmenistan, Uzbekistan, Libya, North Korea and Cuba [17].
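As a rough illustration of how ASN counts might feed a host country profile, the sketch below ranks the countries cited above by their reported ASN counts and flags those under a cutoff. The counts are the figures quoted in the text; the cutoff of 25 is a hypothetical value chosen only for the example, not an established standard.

```python
# Reported ASN counts quoted in the paragraph above.
asn_counts = {
    "Turkey": 290,
    "Azerbaijan": 31,
    "Afghanistan": 20,
    "Turkmenistan": 2,
    "Syria": 2,
    "Yemen": 1,
}

# Hypothetical cutoff below which a country is flagged for closer review.
LOW_ASN_THRESHOLD = 25

# Fewest ASNs first; ties are broken alphabetically for a stable order.
ranked = sorted(asn_counts.items(), key=lambda item: (item[1], item[0]))
at_risk = [country for country, count in ranked if count < LOW_ASN_THRESHOLD]

for country, count in ranked:
    flag = "REVIEW" if count < LOW_ASN_THRESHOLD else "ok"
    print(f"{country:<13}{count:>5}  {flag}")
```

An ASN count is only a first-pass indicator; a fuller profile would also weigh international links, exchange points and the ownership of the providers involved.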
Transit Diversity: How much diversity does any particular suborganizational entity have in reaching the Internet’s top level ISPs? When an organization’s primary provider, node or link is degraded, are alternate providers and conduits already connected, ready and able to maintain presence seamlessly? This is a deeper question than just the number of ASes directly servicing each critical network of any multinational corporation or global organization. The answer has to include analysis of upstream providers and the routing architecture all the way up to the Internet backbone in order to see if multiple ASes converge at some single point of vulnerability. ISP business relationships are constantly in flux and with them the routing architectures of the affected ASes. An effective analytical process has to include continuous communication with service providers and periodic routing updates.
Resilience is a function of transit diversity that is achieved not only by using multiple direct AS providers, but also by ensuring multiple paths among upstream indirect providers all the way to the Internet backbone. In addition to assessing direct and indirect provider relationships for each of its principal subordinate entities, every global organization has to be aware of the business relationships of its Internet service providers. Do some or all providers pass through the same parent AS? Are critical ASes vulnerable to de-peering actions that could make it difficult for them to connect with customers or collaborators? If so, an organization-wide disruption could generate perturbations across all ASes servicing one or many connections.
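The question of whether several providers converge on the same parent AS can be framed as a graph search. The sketch below is one simple way to do so, assuming a hypothetical map from each network to its upstream providers: it removes one AS at a time and checks whether the organization can still reach any backbone AS. The labels are invented for illustration, and this is not the proprietary scoring method discussed in this paper.

```python
from collections import deque


def reaches_backbone(upstreams, origin, backbone, removed=None):
    """Search upward from origin, skipping `removed`, until a backbone AS is found."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        if current in backbone:
            return True
        for provider in upstreams.get(current, ()):
            if provider != removed and provider not in seen:
                seen.add(provider)
                queue.append(provider)
    return False


def single_points_of_failure(upstreams, origin, backbone):
    """List every upstream AS whose loss cuts origin off from the backbone."""
    candidates = {p for providers in upstreams.values() for p in providers}
    candidates.discard(origin)
    return sorted(
        asn for asn in candidates
        if not reaches_backbone(upstreams, origin, backbone, removed=asn)
    )


# Hypothetical topology: two direct providers that share one parent AS.
upstreams = {
    "ORG": ["A", "B"],
    "A": ["P"],
    "B": ["P"],
    "P": ["T1a", "T1b"],
}
backbone = {"T1a", "T1b"}
print(single_points_of_failure(upstreams, "ORG", backbone))

# Giving provider B a second, independent parent removes the chokepoint.
diverse = dict(upstreams, B=["P", "Q"], Q=["T1b"])
print(single_points_of_failure(diverse, "ORG", backbone))
```

In the first topology the organization appears to have two providers, yet both depend on the shared parent "P". In the second, no single upstream AS is critical.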
A recent Dyn/Renesys analysis looked at routing data at the AS level to assess direct and indirect diversity by studying relationships among providers. They then scored organizations on their degree of transit diversity, from none (a score of zero) to rich diversity with no single points of failure (a score of 100). The scatter plot in Chart A below represents scores for tens of thousands of organizations with Internet presence, grouped by the number of ASes employed (vertical axis). Notably, the United States Department of Defense (DoD), just one of many organizations in the analysis, scored low at 34, indicating significant room for improving the transit diversity of its Internet presence [18]. The low DoD score likely reflects a large, dispersed organization with a multitude of individual, locally contracted transit providers. In addition, a DoD-wide scoring result at any point on the scale does not tell us much, as there is no indication of where transit diversity is low and where it might be sufficient. Are major commands vulnerable to disruption? There is no breakdown among the multitude of subordinate commands or agencies dealing in such functions as logistics or personnel. The Renesys algorithms are proprietary, but more detailed analysis of the data set is possible and would likely be useful.
Chart A. A Renesys Transit Diversity scorecard
A Dual Direct — Indirect Dependency: The Electric Power Grid
Internet dependence related to electric power grids is both direct and indirect. Most organizations with critical networks maintain backup or emergency power supplies such as generators to compensate for power outages; however, some are investigating dual-sourced electrical feeds for key installations. This is important since emergency fuel can run out or back-up components can fail. Almost all critical Internet infrastructures are operated by the private sector. Private sector infrastructure continuity of operations (COOP) mechanisms vary in quality, but most do have backup power capacity, albeit for limited periods. Thus, widespread, long duration power outages that affect multiple service providers could degrade Internet performance for many users. Disruptions of power and Internet service also may make it impossible for consumers to pay for critical commodities like fuel and medicine, with potential consequences for social stability [19].
Geopolitical Influences on the Physical/Logical Internet
In addition to physical and logical vulnerabilities that could hinder an organization’s ability to use the Internet globally, providers abroad may not be paying enough attention to infrastructure modernization or security. A related concern is that unrest in countries with weak governance or unfriendly national governments could block, or distort, Internet transits within their borders through legal maneuvers or other actions.
Some states regulate Internet content and control telecommunications through government-owned companies, or own the physical infrastructure required to connect to the Internet. Authoritarian regimes have also acted to suppress Internet access when a crisis arises. During the January 2011 protests in Egypt the government invoked a national security provision in its telecommunications law to direct foreign wireless carriers to disable access to mobile networks. Most did initially, though later some restored limited services [20]. The gambit had little or no effect on the revolt as protesters rallied to work around the restrictions [21]. However, the action demonstrated the degree to which some states will seek to limit access to unfiltered information and the Internet.
It should be assumed that in authoritarian states, the private (for profit) sector, the citizenry, and third parties (international Internet operators and users) will have little or no recourse to legal appeal or regulatory remedy against state actions to interfere with the Internet. At the same time, poor physical or cyber security also could open holes that could disable or degrade Internet connectivity based on non-attributable actions from almost any geographical area.
Considerations for diplomatic, legal or commercial engagement to reduce vulnerabilities
Diplomatic, legal, or commercial engagement, at various levels, may be able to reduce vulnerabilities and broaden the redundancy of the Internet abroad. The most important step is to convince governments to treat the security and reliability of information and communications, and their related infrastructures (ICT), with a much higher priority than they do now. These capabilities are helping to determine winners and losers in economies, affecting international relations and changing the way our children think. Yet they are rarely treated as either essential services or critical infrastructures in interagency priorities.
These are issues for business and government leaders and policy makers, not just cyberspace technicians. Some opportunities are:
• Encourage countries that lack sufficient redundancy within their borders to increase routing diversification of at least their main conduits. Network resiliency could be added to the agenda of transit fee negotiations.
• Engage on plans for new or upgraded fiber optic cables overland or undersea, often backed by consortia of multinational companies and the countries to be served. Multinational consortia can be encouraged to include protective measures in their design, such as burying undersea cables, hardening overland connection points, installing the latest monitoring and investing in robust maintenance capacity.
• Apprise countries that host telco/carrier hotels, large data centers, cable landing points or satellite downlinks of potential threats as well as the ramifications of disruption. They might be offered risk assessment evaluations and recommendations to protect their sites better.
• Encourage international organizations and their affiliated agencies to deepen information sharing on the reliability and security of the Internet across primary links and at key sites. This cooperation might eventually extend to sharing threat information and response actions.
• Promote the establishment of legal processes to govern, or otherwise oversee, the actions of responsible commercial providers operating on national territory.
• Encourage all governments to increase access to the Internet, which has the spin-off effect of increasing demand for reliable services and thus investment in the latest, most capable technologies and management.
• Advocate in various fora that nations foster use of routing registries as a best business practice and national standard. Non-registry and out-of-date routing policies hamper trace back and analysis of vulnerabilities.
• Engage aggressively in international standards bodies to reduce support for Internet disruption or segregation.
Denied or Degraded Internet Access: A Manageable Risk
Given the robust Internet connections in regions such as the continental United States, between North America and Europe, inside Europe, and across the Pacific to key Asian allies, one would expect that the risk to transmissions in these areas would be low. In most current areas of interest, failures of links would add latency, but it probably would take multiple failures before the increased latency becomes significant. Nevertheless, it is possible that the unintended routing of multiple fiber optic links through vulnerable tunnels or along bridges could lead to unpleasant surprises. Since redundancy and security are much less robust in other areas, subsequent research should look at particular organizational routing in more detail, attempt to understand the scoring with more regional granularity, and suggest mitigation measures. More work should be done with the business community, and academic research also may yield valuable insights. Power grid security also needs to be considered, as noted above.
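The expectation that a single link failure adds modest latency, while several failures together matter more, can be illustrated with a small shortest-path calculation. What follows is a minimal Python sketch, not a model of the real backbone. The city codes echo places named in this paper, but the link set and every latency figure are invented for illustration, and `shortest_latency` is a hypothetical helper.

```python
import heapq

# Hypothetical backbone: (city, city) -> one-way latency in milliseconds.
# Both the links and the latency values are invented for illustration.
LINKS = {
    ("NYC", "CHI"): 10,
    ("CHI", "DEN"): 12,
    ("DEN", "SFO"): 13,
    ("NYC", "ATL"): 11,
    ("ATL", "DAL"): 10,
    ("DAL", "LAX"): 16,
    ("LAX", "SFO"): 5,
    ("CHI", "DAL"): 11,
    ("DAL", "DEN"): 14,
}


def shortest_latency(links, src, dst, failed=()):
    """Return the lowest total latency from src to dst, or None if unreachable.

    `failed` lists links (same key order as in `links`) that are treated as down.
    """
    down = set(failed)
    graph = {}
    for (a, b), ms in links.items():
        if (a, b) in down:
            continue
        # Links are bidirectional, so record both directions.
        graph.setdefault(a, []).append((b, ms))
        graph.setdefault(b, []).append((a, ms))

    # Dijkstra's algorithm with a priority queue keyed on accumulated latency.
    best = {src: 0}
    queue = [(0, src)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == dst:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, ms in graph.get(node, []):
            cand = dist + ms
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(queue, (cand, nxt))
    return None


baseline = shortest_latency(LINKS, "NYC", "SFO")
one_cut = shortest_latency(LINKS, "NYC", "SFO", failed=[("CHI", "DEN")])
two_cuts = shortest_latency(
    LINKS, "NYC", "SFO", failed=[("CHI", "DEN"), ("DAL", "LAX")]
)
three_cuts = shortest_latency(
    LINKS, "NYC", "SFO", failed=[("CHI", "DEN"), ("DAL", "LAX"), ("DAL", "DEN")]
)
```

On this invented topology the New York to San Francisco path costs 35 ms at baseline, 42 ms after one cut and 48 ms after two, and the destination becomes unreachable only after the third cut. Real figures would depend on actual routes and routing policies.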
Figure 1. U.S. Internet Connectivity Map
Figure 1 depicts the U.S. Internet backbone and primary subnetworks. Major Internet peer-to-peer interconnect points are (East to West): New York-Washington, Chicago-Dallas-Houston, and Los Angeles-San Francisco [22]. However, the networks shown have heavy redundancy, especially on routes connecting New York, Washington, Atlanta, Chicago, South Florida, Kansas City, Denver, California and Seattle.
Connectivity is not dependent on one city or one network provider. Rather, there are many possible paths to reroute around most disruptions. However, the highest speeds (the 10 Gbps OC-192c/STM-64 Network) are available only from New York-Washington-Chicago. Still-impressive speeds of 2.5 Gbps (on the OC-48c/STM-16 Network) are found coast to coast among peer-to-peer cities [23].
The redundancy and speeds found across the United States also are present on important transoceanic routes, such as from greater New York to the United Kingdom, which is served by 6 different undersea cables [24] plus several nearby routes to and from New England and France. Similar capacity is present between the U.S. and key Asian allies and trading partners.
Figure 2. North Atlantic Cable Graphic
This does not mean that there is no threat to information networks’ resilience and connectivity. A case in point can be seen in the greater concentration of undersea cables terminating together on the European end of the transatlantic cable system in Figure 2, particularly in southwestern United Kingdom (Cornwall) and northwestern France (Normandy region). A well-coordinated and successful operation to physically disable many Internet nodes or pathways could result in major degradation of private sector information networks.
One scenario that could lead to a massive, wide area and rapid degradation of the Internet involves electromagnetic pulse (EMP). A high altitude nuclear burst over the central United States would have significant consequences, as would a high intensity geomagnetic storm, such as the 1859 Carrington Event [25]. Either could cut electrical power and communications connectivity over thousands of square miles.
The damage mechanisms are different: High altitude nuclear bursts generate very short duration pulses (so-called E1 pulses), while space weather events typically affect power grids through long duration E3 events that affect electrical components such as high voltage transformers. Although unlikely, these kinds of Internet interruptions need to be considered in contingency plans, but they are outside the scope of this study.
In sum, total denial of network connectivity through physical attacks or accident is unlikely, although a partial denial or degradation of connectivity and service could result from the disruption or destruction of several nodes or pathways. Traffic could be re-routed automatically where the “open shortest path first” protocol is employed [26]. However, this still could result in delayed data transmission and application failure due to latency or a bandwidth shortage. Organizations could see higher costs from the use of more expensive bandwidth and alternative systems, such as satellites. More importantly, the degraded service also could have serious impacts on time-sensitive operations. For example, a serious Internet disruption during a natural disaster or military conflict in the Western Pacific could have multiple effects.
First, commercial, government and humanitarian relief organizations, including militaries, could suffer immediate disruptions in information flows related to response operations, information gathering, and command, control and communications across the region. Second, the ability to sustain prolonged operations might be jeopardized if the U.S. Global Transportation Network and other logistical support systems were disrupted, degraded or thought to be unreliable. Finally, governments would have to consider the possibility that such disruptions could escalate or widen to other information networks, both military and civilian, worldwide, to include critical infrastructure and financial systems. Threats to economic and social stability in the U.S. and elsewhere may be crucial second or third order effects.
Other regions, such as Africa and Central Asia, have less-developed physical Internet infrastructure, but this is changing due to rapid infrastructure development intended both to support these areas and to transit through them. In the event of relief operations in a less-developed region, the UN and other agencies probably would be able to meet operational needs by augmenting existing indigenous infrastructure with microwave, satellite and other capabilities. However, developing partner nation capacity and effective host nation institutions, through a mix of public and private means, is likely to be more effective and less costly in the long run. It may be possible to exert leverage on states, or other actors, by supporting the build-up of an Internet capability, or by causing doubts in decision makers’ minds about their ability to count on continued access in a crisis. This might one day be formed into a theory of Internet-based suasion, dissuasion, or deterrence, but that is for a different line of research.
Conclusions, Recommendations and Further Analysis
Continuous connectivity across the Internet is essential to all business operations of large organizations, especially global entities. Robust access resiliency can help guard against disruptions. Single point snapshots are not enough. Such organizations must have continuous situational awareness of their immediate connectivity as well as the systemic connections of upstream ASes. Such situational awareness has to extend externally to include an understanding of the overall physical Internet infrastructure and logical topologies involved in routing traffic across networks from origin to destination, including an appreciation of potential vulnerabilities along logical paths, which may stem from many sources. Determining adequate connectivity requires establishing standards, which may not be fully defined. Each organization can begin to improve its connection resilience by ensuring all its components incorporate diversity into Internet access agreements, and by assessing processes, organizations and technology. Resiliency may also require new initiatives such as public-private partnerships, whole-of-government engagement, and bilateral diplomacy.
Recommendations
• Identify ASes critical to an organization’s strategic and operational connectivity.
• Develop a high fidelity methodology to monitor Internet dependencies continuously.
• Work with governments and private sector providers in the United States and overseas to enhance at least the critical network connectivity and security.
• Identify international initiatives to encourage host nations to address infrastructure weaknesses and vulnerabilities.
• Examine Internet service contracts and routing policies to reduce vulnerability and enhance ability to re-route network traffic rapidly when/where disruptions occur.
• With regard to cloud service providers, ensure backup data and systems are not stored in the same cloud.
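The first recommendation, identifying ASes critical to an organization’s connectivity, can be approached as a graph question: which intermediate ASes, if lost, would cut every path between the organization and a destination? The Python sketch below is one possible way to frame it, under stated assumptions. The AS-level topology is invented, the AS numbers are taken from the private-use range (64512 to 65534) so that no real network is implied, and `build_adjacency`, `reachable` and `critical_ases` are hypothetical helper names. Real analysis would also have to account for routing policies, which this sketch ignores.

```python
from collections import deque

# Hypothetical AS-level links. 64512 is the organization and 64518 is the
# destination; every AS number and link here is invented for illustration.
EDGES = [
    (64512, 64513), (64512, 64514),   # organization has two upstream providers
    (64513, 64515), (64514, 64515),   # both upstreams share one transit AS
    (64515, 64516), (64515, 64517),
    (64516, 64518), (64517, 64518),
]


def build_adjacency(edges):
    """Turn a list of undirected AS links into an adjacency mapping."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def reachable(adj, src, dst, removed=None):
    """Breadth-first search from src to dst, skipping the `removed` AS."""
    if removed in (src, dst):
        return False
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def critical_ases(adj, src, dst):
    """Return intermediate ASes whose individual loss disconnects src from dst."""
    return sorted(
        asn for asn in adj
        if asn not in (src, dst) and not reachable(adj, src, dst, removed=asn)
    )


adjacency = build_adjacency(EDGES)
single_points = critical_ases(adjacency, 64512, 64518)

# Adding one link that bypasses the shared transit AS removes the dependency.
diversified = build_adjacency(EDGES + [(64514, 64516)])
after_fix = critical_ases(diversified, 64512, 64518)
```

In this invented example the organization appears well connected because it has two upstream providers, yet both depend on the same transit AS (64515), which is the only entry in `single_points`. After the bypass link is added, `after_fix` is empty.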
Areas for Further Analysis
• Examine methodologies for measuring the resilience of Internet connectivity to determine best of breed metrics for continuous monitoring.
• Analyze the connectivity of major sub-agencies, activities, and critical sites and compare their resilience to an acceptable requirements benchmark.
• Develop a methodology to determine the ASes, countries and regions where a global organization has the greatest vulnerability/least resiliency.
• Combine these into a more granular analysis of resilience by region and function.
• Develop a continuous resiliency monitoring system, e.g., scorecards, for each command, agency, field activity, and critical site worldwide.
• Identify globally the critical Internet Exchange Points (IXPs), undersea and overland cables, cable landing points, large data centers and satellite downlinks that are vulnerable to physical disruption and establish risk profiles for each.
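A scorecard of the kind suggested above could start from something very simple. The following Python sketch is an illustration only: the metric (the smaller of the number of distinct providers and the number of distinct physical paths serving a site), the benchmark value of 2, and all site, provider and path names are assumptions made for this example, not an established standard or a method used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessLink:
    """One Internet access connection for a site (hypothetical record)."""
    provider: str
    physical_path: str


def diversity_score(links):
    """Score a site by its weakest form of diversity.

    Two providers sharing one conduit still fail together, so the score is
    the minimum of distinct providers and distinct physical paths.
    """
    providers = {link.provider for link in links}
    paths = {link.physical_path for link in links}
    return min(len(providers), len(paths))


def scorecard(sites, benchmark=2):
    """Compare each site's score with an assumed requirements benchmark."""
    report = {}
    for site, links in sites.items():
        score = diversity_score(links)
        report[site] = {"score": score, "meets_benchmark": score >= benchmark}
    return report


# Invented sites and links, for illustration only.
SITES = {
    "site-a": [
        AccessLink("provider-1", "conduit-north"),
        AccessLink("provider-2", "conduit-north"),  # same conduit as above
    ],
    "site-b": [
        AccessLink("provider-1", "conduit-north"),
        AccessLink("provider-3", "conduit-south"),
    ],
    "site-c": [
        AccessLink("provider-1", "conduit-east"),
    ],
}

report = scorecard(SITES)
```

Here site-a has two contracts but only one physical path, so it scores 1 and falls below the assumed benchmark, as does the single-homed site-c. Only site-b, with two providers on separate paths, meets it. A production scorecard would need richer inputs, such as the upstream AS dependencies and regional factors discussed in this paper.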
[1] Dr. Charles (Chuck) Barry consults in the private sector and at the National Defense University. The presentation here, including the conclusions and recommendations, is his own and does not represent any views, positions or official policies of NDU, the Department of Defense, or the U.S. Government.
[2] This paper is based on a modification of unpublished research conducted between 2012 and 2014 at the National Defense University with Dr. Lin Wells, which itself built on earlier work by Mr. Terry Pudas and Mr. David Kay. All of us worked at the time in the Center for Technology and National Security Policy.
[3] See Doug Madory, Protest Leads to Outage in Thailand, 7:56 pm, 2 December 2013 at http://www.renesys.com/2013/12/protests-lead-outage-thailand/
[4] An Autonomous System (AS) is a set of routers under a single technical administration, using interior gateway protocols (IGP) and common metrics to determine how to route packets within the AS, and using an inter-AS border gateway protocol (BGP) to determine how to route packets to other ASes. See http://www.cisco.com/web/about/ac123/ac147/archived_issues/ipj_9-1/autonomous_system_numbers.html- (March 2006) accessed 1 June 2013. The site referenced takes its definition of Autonomous Systems from Internet Engineering Task Force (IETF) document RFC 4271, edited by Rekhter, Y., Li, T., and Hares, S., and entitled “A Border Gateway Protocol 4 (BGP-4),” Network Working Group/Standards Track, The Internet Society. January 2006. Page 4.
[5] See the Classless Inter-Domain Routing (CIDR) Report at http://www.cidr-report.org/as2.0/, accessed 25 September 2013
[6] This number of Internet exchange points is taken from Telegeography, Inc. See their current map at http://www.telegeography.com/telecom-resources/internet-exchange-map/index.html — accessed 12 December 2013.
[7] ISPs are business ASes providing connection services to the Internet. While all ISPs are ASes, not all ASes are ISPs. For example, the DoD Network Information Center has at least four ASes (AS 721, AS2706, AS27065 and AS27066). See http://bgp.he.net/AS721 — accessed 15 December 2013.
[8] Global Internet Geography, P 10. See http://www.telegeography.com/page_attachments/products/website/research-services/global-internet-geography/0003/1871/GIG_Executive_Summary.pdf — accessed 26 May 2013.
[9] See http://drpeering.net/FAQ/Who-are-the-Tier-1-ISPs.php — accessed 3 June 2013.
[10] Tyson, Jeff, How Internet Infrastructure Works, http://computer.howstuffworks.com/internet/basics/internet-infrastructure2.htm
[11] See Telegeography’s Internet Exchange Map at http://www.internetexchangemap.com/ — accessed 14 July 2013
[12] http://www.fiercetelecom.com/special-reports/submarine-cable-operators-huntnew-routes-counter-congestion-political-turm
[13] See February 2013 FCC report at http://www.fcc.gov/measuring-broadbandamerica/2013/February
[14] For example, one Internet node, the world’s largest data center at 350 East Cermak Ave in Chicago is the city’s second largest power consumer after O’Hare International Airport. Undersea cables typically require 3,000–4,000 watts of power to run the series of signal amplifiers along the cable’s length.
[15] See http://www.potaroo.net/tools/asn32/. This site indicates 71,678 ASNs have been allocated to Regional Internet Registries (RIRs) as of 2 June 2013. Not all of these have been assigned to ASes (there are approximately 42,000 ASes active in the routing system). Most ASNs — more than 63,000 — are the older 16 bit ASNs. An AS may control more than one ASN. Allocation of the newer 32-bit, more flexible ASNs began in 2011. There is no mechanism for “returning” ASNs when no longer used/needed. Therefore the number should be considered an approximation of distinct routing policies employed across the Internet, though the number cited still indicates a remarkably complex mapping exercise. The website indicated is updated daily.
[16] Tunneling or ‘port forwarding’ is the standard technique of many Internet users to protect data. There are a host of protocols and methods, such as Microsoft’s Point-to-Point Tunneling Protocol (PPTP) or secure shell (SSH protocol) tunneling, to encapsulate or encrypt payloads over untrusted networks. Tunneling sets up a virtual private network (VPN) for authorized users. See http://searchenterprisewan.techtarget.com/definition/tunneling, accessed 17 May 2013.
[17] The numbers cited are current as of January 2013. Autonomous System Numbers as assigned by the Internet Assigned Numbers Authority (IANA) for Border Gateway Protocol (BGP) routing use (BGP is the core protocol within the TCP/IP protocol). ASes in a country are usually ISPs and the number of ASNs allocated within a country is one measure of Internet infrastructure robustness. See http://www.renesys.com/tech/presentations/pdf/menog12-cowie.pdf — accessed 19 May 13.
[18] See http://www.renesys.com/2009/05/keeping-score/ — accessed 31 May 2013
[19] The inability of customers to pay for gasoline with credit cards after Hurricane Sandy is one example. A 2011 NDU study of severe space weather events found that prolonged power outages over wide areas would generate particular concern among the public over pharmaceutical distribution.
[20] See http://latimesblogs.latimes.com/babylonbeyond/2011/01/egypt-foreign-telecomms-stepping-in-to-connect-protesters-to-internet.html. France Telecom and Vodafone Group were two that restored service after a few days. Others relied on land lines and old dial up services to circumvent the wireless interruption.
[21] See http://www.aljazeera.com/news/middleeast/2011/01/201112515334871490.html, accessed 26 May 2013. Over the weekend of January 30–31 2011 collaboration between Google, Twitter and a voice messaging service just acquired by Google called SayNow led to an announcement of a toll free international phone service called “speak2tweet” where Egyptian users could leave a twitter message for others. The success of this effort in circumventing efforts to control social media led to a duplicate speak2tweet service in Syria, launched in December 2012 as that government threatened to disrupt public access to social media services and the Internet.
[22] Internet Peer-to-Peer Interconnect map at http://www.nthelp.com/images/interconnect.jpg — accessed 26 May 2013.
[23] The OC-3 Network cities include New York-Philadelphia-Washington-Atlanta; Chicago-St. Louis-Dallas-Phoenix; and Los Angeles-San Francisco-Seattle. See http://www.nthelp.com/images/agis.jpg
[24] The 6 cables are AC-1, AC-2, Apollo, FA-1, Tata TGN-Atlantic and TAT-14. In fact, all of these cables, except AC-2, run in loops across the Atlantic, meaning that there are 6 distinct NY-UK transatlantic cable systems with 11 segments, allowing for even greater traffic.
[25] In 1859 a combination of multiple Coronal Mass Ejections (CME) caused the Northern Lights to be visible as far south as Panama. This was known as the Carrington Event. It has been estimated that this event was as much as ten times more intense than the solar storm that dropped the Quebec power grid in 1989. See http://news.nationalgeographic.com/news/2011/03/110302-solar-flares-sun-storms-earth-dangercarrington-event-science/, accessed 15 December 2013.
[26] Through “open shortest path first” or OSPF, domain networks re-route traffic through the shortest available path by following a distinct routing policy. The OSPF protocol is usually employed internal to an AS and automated such that the shortest/lowest-cost route in terms of transit time is always chosen. It would not necessarily use preferred pathways (e.g., those run by technologically reliable vendors and transiting friendly states) or avoid certain other pathways (such as those operated by an unfriendly power).
This speech was delivered at the 11th Scientific conference of the International Research Consortium on Information Security, as part of the International Forum on «Partnership of state authorities, civil society and business community in ensuring international information security», held on 20-23 April 2015 in Garmisch-Partenkirchen, Germany. It is published on Digital.Report with the explicit permission of the conference organizers.