Yet what is possible in public health is not always so easy in national security. Western intelligence agencies must contend with laws governing how private data may be gathered and used. In its paper, GCHQ says that it will be mindful of systemic bias, such as whether voice-recognition software is more effective with some groups than others, and transparent about margins of error and uncertainty in its algorithms. American spies say, more vaguely, that they will respect "human dignity, rights, and freedoms". These differences may need to be ironed out. One suggestion made by a recent task-force of former American spooks in a report published by the Centre for Strategic and International Studies (CSIS) in Washington was that the "Five Eyes" intelligence alliance—America, Australia, Britain, Canada and New Zealand—create a shared cloud server on which to store data.
In any case, the constraints facing AI in intelligence are as much practical as ethical. Machine learning is good at spotting patterns—such as distinctive patterns of mobile-phone use—but poor at predicting individual behaviour. That is especially true when data are scarce, as in counterterrorism. Predictive-policing models can crunch data from thousands of burglaries each year. Terrorist attacks are much rarer, and therefore harder to learn from.
That rarity creates another problem, familiar to medics pondering mass-screening programmes for rare diseases. Any predictive model will generate false positives, in which innocent people are flagged for investigation. Careful design can drive the false-positive rate down. But because the "base rate" is lower still—there are, mercifully, very few terrorists—even a well-designed system risks sending large numbers of spies off on wild-goose chases.
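The base-rate argument can be made concrete with some back-of-the-envelope arithmetic. The figures below are purely illustrative assumptions, not numbers from the article: a million people screened, a hundred genuine threats, and a model that is generous by real-world standards.

```python
# Base-rate illustration with hypothetical numbers (assumptions, not sourced data).
population = 1_000_000            # people screened
terrorists = 100                  # actual threats: the very low "base rate"
innocents = population - terrorists

tpr = 0.99                        # assumed true-positive rate of a strong model
fpr = 0.001                       # assumed false-positive rate: 1 in 1,000

true_pos = terrorists * tpr       # genuine threats correctly flagged
false_pos = innocents * fpr       # innocent people wrongly flagged
precision = true_pos / (true_pos + false_pos)

print(f"flagged: {true_pos + false_pos:.0f}, of which genuine: {true_pos:.0f}")
print(f"precision: {precision:.1%}")
```

Even with a 99% detection rate and only one false alarm per thousand innocents, roughly a thousand of the eleven hundred people flagged are innocent: about nine in ten leads are wild-goose chases, because the base rate dominates the error rates.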
And those data that do exist may not be suitable. Data from drone cameras, reconnaissance satellites and intercepted phone calls, for instance, are not currently formatted or labelled in ways that are useful for machine learning. Fixing that is a "tedious, time-consuming, and still primarily human task exacerbated by differing labelling standards across and even within agencies", notes the CSIS report. That may not be quite the sort of work that would-be spies signed up for.