Time to use big data
In times of epidemics, how are we to strike a balance between protecting personal privacy and maintaining public safety?
Today, more than five billion people in the world have mobile devices, and many of them can't live without their phones. They represent a huge data pool that many researchers could tap to help manage an infectious disease outbreak, or locate trapped and injured individuals after an earthquake.
In 2017, GSMA, the association representing mobile operators worldwide, launched a "Big Data for Social Good" initiative that encourages telecoms groups to support responses to epidemics and natural disasters by sharing anonymised metadata.
More recently, media outlets looking to track the spread of the new coronavirus in and outside China used location data from tech giant Baidu's map app to analyse the travel patterns of the 5 million people who left Wuhan, the city at the centre of the outbreak, before it was closed off in January.
Now, experts such as Li Tie, chairman and chief economist of the China Centre for Urban Development, have advocated the use of big data to manage a crisis and reduce the risk of a future crisis.
As the Covid-19 outbreak intensified before the Lunar New Year holiday, China placed cities on lockdown and brought the economy to a virtual standstill. The epidemic has affected sectors including catering, retail and tourism, and has also hit Lunar New Year box office revenue. One analyst estimated that economic losses reached 1 trillion yuan (US$143 billion) in the first seven days of the holiday alone.
At the grass-roots level, some local authorities allowed only one person per family to leave the house for two hours a day. As Li pointed out in an article this month, all these drastic measures were "very effective in preventing the further spread of the epidemic, but at huge social and economic costs".
Instead, given that there are now 1.4 billion mobile devices in use in China, Li proposed using mobile phone signalling data to manage an epidemic.
These signals are regularly sent and received by a mobile phone each time it passes a base station in a telecoms network, but only if the phone is switched on.
There are three major mobile operators in China. If the authorities gather mobile phone signalling data from these operators, it is possible to monitor the whereabouts of the national population, or target particular regions or individuals round the clock within the country.
This means that people who have been exposed to a virus and placed on home quarantine should not be able to escape detection. In theory, it is possible to contain an epidemic without shutting down factories or an entire society.
And, in fact, the use of mobile phone data in crisis relief is becoming common. For example, after an earthquake killed more than 100,000 people in Haiti in 2010, researchers from Karolinska Institutet and Columbia University persuaded the largest mobile operator in Haiti to share the anonymised data of about 2 million users. Their data analysis allowed organisations to track the movement of earthquake survivors and plan relief operations accordingly.
In another case, researchers from the Massachusetts Institute of Technology created several models to trace the spread of dengue, a mosquito-borne virus, in Singapore. The model that used the anonymised call records of 2.3 million people from 2011 was found to be more effective than the others.
Still, the use of such data, anonymised or not, raises questions about privacy. In China, people from Wuhan are already being doxxed, with their home addresses, ID numbers and mobile phone numbers published on social media. An internet article puts it rhetorically: "Should the people of Wuhan still be allowed privacy?"
Moreover, studies have shown that even when data is anonymised and aggregated, it is possible to re-identify an individual using just four data points.
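To see why a handful of data points can be enough, consider a minimal sketch in Python. Everything in it is hypothetical: the user labels, the tower IDs and the three toy traces are invented for illustration and are not drawn from the studies mentioned above. Each "anonymised" trace is a set of (hour, tower) observations, and the function simply asks which traces contain every point an observer already knows about a person.

```python
from typing import Dict, List, Set, Tuple

# One observation: (hour of day, tower ID). Tower IDs are hypothetical.
Point = Tuple[int, str]


def matching_users(traces: Dict[str, Set[Point]], known: Set[Point]) -> List[str]:
    """Return the pseudonyms whose trace contains every known point."""
    return sorted(user for user, points in traces.items() if known <= points)


# Toy "anonymised" dataset: names are removed, but movement patterns remain.
traces: Dict[str, Set[Point]] = {
    "user_a": {(8, "T1"), (9, "T2"), (13, "T5"), (18, "T3"), (22, "T1")},
    "user_b": {(8, "T1"), (9, "T2"), (13, "T4"), (18, "T3"), (22, "T1")},
    "user_c": {(8, "T1"), (9, "T6"), (13, "T5"), (18, "T3"), (22, "T7")},
}

# Each extra point the observer knows narrows the set of candidates.
one_point = matching_users(traces, {(8, "T1")})
two_points = matching_users(traces, {(8, "T1"), (9, "T2")})
four_points = matching_users(
    traces, {(8, "T1"), (9, "T2"), (13, "T5"), (18, "T3")}
)

print(len(one_point), len(two_points), len(four_points))
```

In this toy dataset, one known point matches all three traces, two points match two, and four points single out one trace. Real datasets are far larger, but the same narrowing effect is what makes re-identification possible.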
This is why Nathaniel Raymond, from the Yale Jackson Institute for Global Affairs, has warned that mobile phone data might be improperly used by public and private organisations. If the data falls into the wrong hands, people in need of asylum, for example, might be victimised by human-trafficking groups. Therefore, comprehensive guidelines must be developed on the use of mobile data in crisis relief.
Susan Erikson, at Simon Fraser University, has also found limitations to the big data strategy. For example, mobile phone data did not prove useful in Ebola containment efforts in Sierra Leone, where the virus caused 4,000 deaths from 2014 to 2015.
One reason for the failure is the problem with big data analysis itself: analysis models are based on assumptions, and biased assumptions will generate biased results.
Most of us are accustomed to treating mobile phones like extensions of our individual selves. However, in Sierra Leone, mobile phones are "loaned, traded, and passed around among family and friends, like clothes, books, and bicycles," according to Erikson. "A single phone can be shared by an extended family or, in rural areas, a neighbourhood or a village." Therefore, an analysis model based on the assumption that a mobile phone is an extension of an individual was never going to work.
But what lessons can we learn from all this in Hong Kong? If Hong Kong is to become a truly smart city, the government should seize the opportunity thrown up by the outbreak and persuade telecoms operators to share big data with researchers.
At the same time, the authorities must develop comprehensive guidelines on the use of such data, to lay a solid foundation for smart health care in Hong Kong.
Dr. Winnie Tang
Adjunct Professor, Department of Computer Science, Faculty of Engineering and Faculty of Architecture, The University of Hong Kong