Monday, September 17, 2018

E-Resilience in support of Emergency Communications

Figure: Emergency Communications Resiliency Stack

The United Nations Economic and Social Commission for Asia and the Pacific (UN-ESCAP), through its Asia Pacific Information Superhighway (AP-IS) initiative, might consider offering its member states:
  1. A set of tools and methodologies for technology stewards to assess E-Resilience in their own organizations and communities, and then to supply the quantitative and qualitative findings to an AP-IS database that researchers and practitioners can use to analyze national, cross-border, and regional strategies for addressing E-Resilience.
  2. Best-practices for developing community-centered communications networks with options for reliable and proven back-haul and interconnection, along with their resilience to various disaster, geographic, and socioeconomic constraints.
  3. Guidelines for building Business Continuity - Disaster Recovery Plans (BC-DRPs) that comply with emergency communications requirements, taking into consideration survivability & availability and Rapid Restoration of Access to Telecommunication (RReAcT) programs.
These three key recommendations were contributed to the 2nd session of the AP-IS steering committee and WSIS regional review meeting, held on 27th & 28th September 2018 at the UN Conference Center in Thailand. The event was a precursor to the second session of the Committee on Information and Communications Technology, Science, Technology and Innovation.

The main contribution of my talk was to cover E-Resilience; i.e. resilient ICT networks (or, better termed, "ICT services"), support for disaster management systems, and ensuring last-mile disaster communication. The AP-IS initiative aims to enhance the resilience of existing and planned ICT infrastructure through methods such as enhanced network diversity, while recognizing the importance of resilient infrastructure to sustainable development and the critical role played by ICT in disaster risk reduction and management.

VIEW SLIDES - e-Resilience in support of Emergency Communication: "contingencies"

1. Telecom Resilience Analysis Tools and Methodologies


ITU has mapped the undersea and terrestrial networks and their interconnections. Member states may officially request the underlying raw GIS data. However, the raw data is unavailable to the public (researchers and practitioners) for any kind of augmented analysis. With such data one could, for example, apply simple max-flow min-cut or dynamic flow algorithms to determine the optimal resiliency strategies. Open data for resilience initiatives is highly advocated by the Global Facility for Disaster Reduction and Recovery (GFDRR).
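To illustrate the kind of analysis open topology data would allow, here is a minimal Python sketch of max-flow min-cut (using the Edmonds-Karp method) on a made-up network. The node names, link capacities, and the `max_flow_min_cut` helper are all hypothetical; nothing here comes from the ITU maps.

```python
from collections import defaultdict, deque

def max_flow_min_cut(links, source, sink):
    """Edmonds-Karp max flow on undirected links given as (u, v, capacity)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in links:
        residual[u][v] += cap
        residual[v][u] += cap  # a link carries traffic in either direction

    flow = 0
    while True:
        # Breadth-first search for a shortest path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left, so the flow is maximal

        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from the source form one side of a minimum cut.
    reachable = set(parent)
    cut = sorted(
        {tuple(sorted((u, v))) for u, v, _ in links
         if (u in reachable) != (v in reachable)}
    )
    return flow, cut

# Hypothetical topology; capacities in arbitrary units (e.g. Gbps).
LINKS = [
    ("landing_station", "exchange_a", 10),
    ("landing_station", "exchange_b", 5),
    ("exchange_a", "exchange_b", 15),
    ("exchange_a", "island_town", 4),
    ("exchange_b", "island_town", 8),
]

flow, cut = max_flow_min_cut(LINKS, "landing_station", "island_town")
print("max flow:", flow)
print("bottleneck links:", cut)
```

The links returned in `cut` are the ones whose loss would reduce capacity to the town first, so they are natural candidates for added diversity.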

ITU's data can be used in tools such as the Sahana Community Resilience Mapping Tool (CRMT), implemented for Los Angeles County. The Sahana CRMT would use its inherent capabilities to overlay risk maps (hazard, vulnerability, and exposure GIS data) with the ITU infrastructure GIS data to analyze the vulnerabilities and then manage the mitigation plans for the telecom infrastructure. Moreover, it would allow member states to manage and update their own jurisdictional information sets for the greater good.
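The overlay idea can be sketched in a few lines of Python. This is not the Sahana CRMT's own code or API; it is a plain ray-casting point-in-polygon test standing in for a GIS overlay. The hazard polygons, asset names, and coordinates are invented for illustration.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def exposed_assets(assets, hazard_zones):
    """Map each asset to the hazard zones that contain its location."""
    exposure = {}
    for name, location in assets.items():
        hits = [zone for zone, polygon in hazard_zones.items()
                if point_in_polygon(location, polygon)]
        if hits:
            exposure[name] = hits
    return exposure

# Hypothetical hazard layers (simple polygons in arbitrary map units).
HAZARD_ZONES = {
    "flood": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "landslide": [(3, 3), (8, 3), (8, 8)],
}

# Hypothetical telecom infrastructure points.
ASSETS = {
    "tower_1": (2, 2),
    "exchange": (3.5, 3.2),
    "cable_hut": (6, 1),
}

exposure = exposed_assets(ASSETS, HAZARD_ZONES)
print(exposure)
```

A real deployment would rely on proper GIS tooling and projected coordinates; the point here is only that the overlay reduces to asking which assets fall inside which hazard footprints.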

Another effective tool and methodology is the Risk Assessment and Step-wise Refinement (RASTER). The tool allows organizations and communities to model their critical infrastructure. It uses 5 basic components (actor, wireless, wired, equipment, and cloud (unknown)) that are linked to model the system architecture. Thereafter, a participatory approach is applied to define the frequency of various threats and their impact (based on a Likert scale) on the individual components that enable the service(s). The tool then analyzes the data to propose "quick win" mitigation strategies for the single and common points of failure. It is a simple and easy-to-use tool, free for all to adopt. LIRNEasia recently demonstrated the use of the tool and engaged participants in analyzing a scenario at APrIGF2018.
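As a rough illustration of the frequency-and-impact step, the Python sketch below ranks component threats by a simple product of two 1 to 5 Likert ratings. This is a deliberate simplification and not RASTER's own scoring method. The components, ratings, and the threshold of 12 are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Assessment:
    component: str
    kind: str       # actor, wireless, wired, equipment, or cloud
    threat: str
    frequency: int  # Likert 1 (rare) to 5 (frequent), agreed by participants
    impact: int     # Likert 1 (negligible) to 5 (severe)

def rank_quick_wins(assessments, threshold=12):
    """Return (score, component, threat) tuples at or above the threshold,
    highest score first. Score = frequency * impact (an assumed metric)."""
    scored = [(a.frequency * a.impact, a.component, a.threat)
              for a in assessments]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)

# Hypothetical workshop results.
ASSESSMENTS = [
    Assessment("microwave link", "wireless", "storm misalignment", 4, 4),
    Assessment("fibre trunk", "wired", "cable cut", 2, 5),
    Assessment("base station", "equipment", "power outage", 5, 4),
    Assessment("hosted SMS gateway", "cloud", "provider outage", 1, 3),
]

quick_wins = rank_quick_wins(ASSESSMENTS)
for score, component, threat in quick_wins:
    print(f"{score:>2}  {component}: {threat}")
```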

2a. Community Networks for bridging the Last-Mile


While governments and international humanitarian organizations may have access to various satellite (e.g. VSAT), high-altitude platform (e.g. Facebook's Aquila), and terrestrial technologies, the public is limited to 3GPP and ADSL technologies. Community networks are emerging as a way of extending the Internet to the marginalized. However, such community networks must be resilient to secure the continuity of essential public services. Moreover, they should also be sustainable and serve as an economically viable resource for the RReAcTs. LIRNEasia, along with UN-ESCAP, ISOC, and AIT IntERLab, demonstrated such a resilient community network-based RReAcT solution at APrIGF2017.
Source: AIT IntERLab CWMN

The demonstrated solution was based on an AIT IntERLab home-brewed mesh network. The technology adopts a simple business model to connect marginalized communities. Basically, TakNet, a startup floated by IntERLab, buys unlimited capacity from the ISP for THB 750/mo per access point. Thereafter, it optimally redistributes that capacity in the community. TakNet charges each user accessing the particular community network THB 250/mo. That generates enough income to support a technician in the community and the upkeep of the hardware (routers and access points).
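The arithmetic behind this model is simple enough to sketch in Python. The two prices come from the figures quoted above; the function names and the user count of 20 are hypothetical, and technician and hardware costs are left out because they are not stated here.

```python
import math

BACKHAUL_THB_PER_AP = 750  # monthly ISP charge per access point (from the text)
FEE_THB_PER_USER = 250     # monthly fee per user (from the text)

def break_even_users(backhaul=BACKHAUL_THB_PER_AP, fee=FEE_THB_PER_USER):
    """Users needed on one access point to cover its back-haul cost."""
    return math.ceil(backhaul / fee)

def monthly_surplus(users, access_points=1,
                    backhaul=BACKHAUL_THB_PER_AP, fee=FEE_THB_PER_USER):
    """Income left after back-haul, available for technician and upkeep."""
    return users * fee - access_points * backhaul

print("break-even users per access point:", break_even_users())
# Hypothetical community: 20 users sharing one access point.
print("surplus with 20 users (THB/mo):", monthly_surplus(20))
```

Under these assumptions, every user beyond the third on an access point contributes directly to the technician's pay and hardware upkeep.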

We also heard from CViSNET about Nippon Telegraph and Telephone's Movable Deployable Resource Unit (MDRU), developed after the 2011 Japan Earthquake and used after 2013 Typhoon Haiyan in the Philippines. The MDRU turnkey solution, especially the grab-n-go suitcase, is quite versatile for rapidly restoring telecoms. It is capable of latching onto any 3GPP or ADSL back-haul to offer voice and data services.

Besides supporting RReAcT, the community networks make a supplementary contribution to the AP-IS pillar "broadband for all". The initiative aims to bridge the digital divide, promote affordable access in under-served areas, and provide policy and technical support to governments. The Asia Pacific Network Information Centre (APNIC) and the Internet Society (ISOC) are key advocates of community networks and are jointly investing resources in expanding them.

2b. The Back-haul: a critical single point of failure


Figure: Reporting delays since the day of the earthquake; mountain areas of Nepal.
The last-mile community networks rely heavily on a single ISP back-haul, and that is their weakest link. What we have realized from various studies, including our most recent report on the 2015 Nepal Earthquake, is that ISPs did not have good contingencies to quickly bounce back from disasters.
For example, an analysis making use of large volumes of relevant social media data and a density-based clustering algorithm revealed that it was not until the 4th day (past the "golden 72 hours") that public reports started coming in. The accompanying map shows that foot soldiers had to hike from one village to another to transmit incident reports.

A similar study in India learned that during the 2014 Jammu & Kashmir floods, one telco's base station equipment was submerged, but the other's, on the second floor, was unharmed. Although both telcos agreed to share the infrastructure, as per the Indian "intra circle roaming agreement", it took their engineers a day and a half to configure the systems to revive services; even then, that was 15 days after the onset of the flood.

We have also learned that point-to-point WiMax (2.4 & 5.2 GHz) for mountainous areas, LTE (700 MHz band) radio for Mars-like terrain, and LEO satellites for small islands have proven to be best-practices. Long-distance point-to-point links work well between mountains in the absence of obstacles. High frequency radio waves work well in the presence of obstacles but limit the distance. WiMax and LTE nodes require very little power. The Nepal study revealed that service providers were able to restart services using solar-powered batteries and make-shift bamboo hoists for antennas. Why not have these configurations ready on a USB stick for quick plug-n-play?

TV White Space is emerging as an alternative for back-haul. In Korea, the company Innonet has working solutions that use TV White Space, proven economical, with specific surveillance solutions; i.e. networks of CCTV cameras. There are many such solutions that ESCAP AP-IS might consider classifying and documenting as best-practices for member states to adopt in their emergency communications plans.

3. BC-DRP and Key Indicators for assessing the effectiveness


For survivability & availability and RReAcT programs to come together, there is a need for practical and proven BC-DRPs. To that end, there is a growing need for guidelines, best-practices, and checklists to ensure telecom service providers meet emergency communication standards, which are typically stricter than normal business contingency plans. It would also require revised language in "service level agreements" between providers and regulators.

Figure: Components and inter-dependencies of ICT resilience in support of emergency communications.
ICT resilience in support of emergency communication requires near 100% survivability & availability and RReAcT programs to support real-time data streams for enabling a Common Operating Picture for emergency services and the public. Survivability and availability are a function of the telecommunications infrastructure's exposure to risk, and risks, as we know, are inevitable. Therefore, telecommunications service providers must establish practical and proven BC-DRPs with sturdy telecom services and complementing RReAcT programs. To that end, "key risk indicators" (KRIs) and "key performance indicators" (KPIs) must be established and agreed upon in line with the service levels set forth by emergency communications planners.


BC-DRP is a combination of preparedness and response with practical & proven plans. Preparedness must secure the continuity of the ICT services by setting KRIs that govern the survivability and availability. These are determined by establishing the incident "frequencies" and "impact" on the essential emergency services, then mitigating those vulnerabilities. One may apply an 80/20 rule to define the factors: congestion, damage/break, power, interference, etc. Implementers must also include "social risk" in the KRIs. Typical social risks consider the effects on children, women, & the elderly, trust in public goods, the amount of fear, and so on.

KPIs are typically defined by the Mean Time To Failure (MTTF) and the Mean Time To Repair or Recovery (MTTR). Some choose to use Mean Time Between Failures (MTBF) instead of MTTF. Ideally, failures would never occur and MTTR would be zero, so that availability, MTTF / (MTTF + MTTR), would reach 100%. However, that is practically impossible, given that we live in a world full of uncertainties. Therefore, national emergency communication planners may define their national MTTR to be less than 8 hours. To achieve such an ambitious KPI one must partner with various stakeholders; for example, by agreeing with the domestic aviation industry to provide emergency transportation of communications equipment in times of crisis.
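To make the KPI relationship concrete, here is a minimal Python sketch using the standard steady-state availability formula, availability = MTTF / (MTTF + MTTR). The 8-hour MTTR comes from the target mentioned above; the MTTF of 8,760 hours (one failure a year) and the 99.9% availability target are assumptions chosen purely for illustration.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability: fraction of time the service is up."""
    return mttf_hours / (mttf_hours + mttr_hours)

def max_mttr_for_target(mttf_hours, target_availability):
    """Largest MTTR (hours) that still meets the target availability."""
    return mttf_hours * (1 - target_availability) / target_availability

ASSUMED_MTTF_HOURS = 8760  # hypothetical: one failure per year
TARGET_MTTR_HOURS = 8      # the national MTTR target discussed above

a = availability(ASSUMED_MTTF_HOURS, TARGET_MTTR_HOURS)
print(f"availability with an 8-hour MTTR: {a:.4%}")

limit = max_mttr_for_target(ASSUMED_MTTF_HOURS, 0.999)
print(f"MTTR allowed for 99.9% availability: {limit:.2f} hours")
```

The same two functions can be used to check a proposed service level agreement: given a provider's observed MTTF, they show whether a stated recovery time is compatible with the availability the emergency planners require.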

The response component of a BC-DRP should consider a solid RReAcT program that makes economic sense. That is achieved by setting the Recovery Time Objectives (RTOs). The RTOs must consider the human factors that are often neglected. Best-practice is to set service-based recovery times (e.g. data first and voice last, or vice versa). Secondly, Recovery Point Objectives (RPOs) prioritize essential services; i.e. which organizations in which geographic locations must attain what capacity.

Conclusion

There is a lot to be achieved if we are "to leave no one behind". AP-IS seems to be gaining momentum in setting stepping stones and building pathways to intertwine the essential resources and support from the member states. We are committed to AP-IS and will work with ESCAP in providing our expertise to bridge the gaps, especially in the areas of the three key recommendations set out at the start of this blog.