Private Information Disclosure from Web Searches (or how to reconstruct users’ search histories)

People

Claude Castelluccia (INRIA), Emiliano De Cristofaro (UCI), Daniele Perito (INRIA)

News!

Several articles about this result have appeared in the press:
[MIT Tech] [Slashdot] [ACM News] [The Register].

The paper has been accepted for publication at PETS’10 and was presented during the poster session of the IEEE Security & Privacy Symposium.

Overview

As the amount of personal information stored at remote service providers increases, so does the danger of private information leakage. Leakage is easy to achieve when connections to remote services are made in the clear and authenticated sessions are maintained using HTTP cookies. The session hijacking attack has threatened user privacy and security for several years: an attacker monitoring the network can capture an authentication cookie and impersonate a user (for more information, refer to: Researchers’ Letter, Cookiemonster).
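To make the exposure concrete, the following minimal sketch flags cookies that a browser would also send over plain HTTP, where a network eavesdropper could capture them. It only inspects the standard `Secure` attribute of `Set-Cookie` headers. The cookie names, values, and the helper `cookies_exposed_over_http` are hypothetical and chosen for illustration; they are not taken from the report.

```python
from http.cookies import SimpleCookie
from typing import List


def cookies_exposed_over_http(set_cookie_headers: List[str]) -> List[str]:
    """Return the names of cookies that lack the Secure attribute.

    A cookie without Secure is attached to plain-HTTP requests as well,
    so anyone observing the network can read it and replay it.
    """
    exposed = []
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)  # parse one Set-Cookie header value
        for name, morsel in jar.items():
            # morsel["secure"] is True when the flag is present, "" otherwise
            if not morsel["secure"]:
                exposed.append(name)
    return exposed


# Hypothetical headers: one session cookie sent in the clear, one HTTPS-only.
example_headers = [
    "SESSION=abc123; Path=/; Domain=.example.com",
    "AUTH=def456; Path=/accounts; Secure; HttpOnly",
]

if __name__ == "__main__":
    print(cookies_exposed_over_http(example_headers))
```

In this example only the cookie without `Secure` is reported. That is the kind of cookie a session hijacking attack relies on.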

Our research starts by analyzing the architecture of Google, the world’s largest service provider, and shows that many Google services are still vulnerable to simple session hijacking (with the exception of a few services accessible only over HTTPS such as Gmail).

Next, we focus on privacy leakages related to Google Web History. This service records all searches made by a Google signed-in user. The Web History is used to provide personalized results and keyword suggestions for searches that a user has already made. We design the Historiographer, a novel attack that reconstructs the web search history of Google users, even though this service is supposedly protected from session hijacking by a stricter access control policy. The Historiographer uses a reconstruction technique to infer search history from the personalized suggestions fed by the Google search engine. Its validity is confirmed through experiments conducted over real network traffic. We point out that our attacks are general, not specific to Google, and highlight that:

(1) Privacy issues are often created by mixed architectures using both secure and insecure connections.

(2) Web searches are privacy-sensitive and need to be carefully handled.

(3) Web search requests should be encrypted, for example with HTTPS, or personalization (of requests and results) should be deactivated.
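The idea of inferring a search history from personalized suggestions can be illustrated with a small simulation. This is a simplified sketch, not the Historiographer itself. It assumes a suggestion oracle that, given a prefix, returns at most a fixed number of history entries starting with that prefix, and it replaces the real service with a local mock. The function names, the alphabet, and the `limit` and `max_depth` parameters are all assumptions made for illustration.

```python
import string
from collections import deque
from typing import Callable, List, Set


def make_mock_oracle(history: List[str], limit: int) -> Callable[[str], List[str]]:
    """Hypothetical stand-in for a personalized suggestion service."""
    def oracle(prefix: str) -> List[str]:
        matches = sorted(q for q in history if q.startswith(prefix))
        return matches[:limit]  # the service only shows a few suggestions
    return oracle


def reconstruct_history(oracle: Callable[[str], List[str]],
                        alphabet: str,
                        limit: int,
                        max_depth: int = 4) -> Set[str]:
    """Probe the oracle with growing prefixes and collect what it reveals."""
    recovered: Set[str] = set()
    queue = deque(alphabet)  # start with every one-character prefix
    while queue:
        prefix = queue.popleft()
        suggestions = oracle(prefix)
        recovered.update(suggestions)
        # A full list may hide further entries, so refine the prefix.
        if len(suggestions) >= limit and len(prefix) < max_depth:
            queue.extend(prefix + c for c in alphabet)
    return recovered


# Made-up history used only to exercise the sketch.
sample_history = [
    "paris hotels", "paris weather", "pets 2010",
    "privacy cookies", "python tutorial", "session hijacking",
]
probe_alphabet = string.ascii_lowercase + string.digits + " "

if __name__ == "__main__":
    mock = make_mock_oracle(sample_history, limit=3)
    print(sorted(reconstruct_history(mock, probe_alphabet, limit=3)))
```

The breadth-first refinement only extends a prefix when the suggestion list is full, which keeps the number of probes small when few history entries share a prefix. An attacker in the setting described above would issue such probes with a captured cookie; the sketch deliberately makes no network requests.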

Papers

A technical report (accepted for publication at PETS 2010) is available here.

Updates

February/March 2010: A preliminary version of our report is sent to Google. Search suggestions are suspended! Google is investigating the problem carefully and has decided to temporarily suspend the search suggestions from Search History. The Google Web History page is finally offered over HTTPS only.

March 1st, 2010: Bing starts a service similar to Google’s Web History and related suggestions. The only difference is that the history is associated with an anonymous cookie stored on the local machine for a maximum of 29 days, so the privacy leakage for Bing would be limited to that period and that machine. In contrast, the leakage resulting from the Historiographer on Google may cover months of searches conducted from different computers, as the history is associated with a Google account and not with an anonymous cookie.

March 15th, 2010: Google public statement (attributable to Alma Whitten, Software Engineer, Google Security & Privacy):

“We highly value our relationship with the security research community, and we are grateful to Dr. Castelluccia and his fellow researchers from INRIA (D. Perito) and University of California, Irvine (E. De Cristofaro) who have been in contact with us since the end of February about their findings related to open, unsecured network connections and personalized suggestion technology. Google has been and continues to be an industry leader in providing support for SSL encryption in our services, which is designed to address precisely the issues that all major websites face when transmitting information over http to users connecting via an unsecured network channel. Since hearing from the researchers, we have fixed the issues they raised by moving our Web History and Bookmarks pages to https, as well as encrypting the backend server requests associated with our personalized Maps suggestion service–and soon our Search suggestion service. We look forward to providing more support for SSL technologies across our product offerings in the future.”

March 15th, 2010: Our answers to Google’s statement:

Foreword: We would really like to acknowledge Google’s positive attitude toward our report and results. Google has been very responsive to our findings and is taking actions to fix them. We are very pleased about it.

(1) Moving Web History and Bookmarks pages to HTTPS is a great step towards improving privacy. However, it does not prevent potential leakage resulting from personalized suggestions.

(2) Google aims to counter the Historiographer attack by encrypting back-end server requests associated with the personalized Maps and (soon) Search suggestion services. We are unsure what this really means. Is Google planning to use HTTPS for all search requests of logged-in users? As of today (March 15th, 2010), neither of these two countermeasures is implemented: Maps requests can still be reconstructed and personalized Search suggestions are still deactivated (the service is degraded).

(3) The Historiographer attack is still applicable to iPhones, since personalized suggestions have not been deactivated on mobile terminals (see new section 4.4 in the latest version of our report). Because phones have constrained user interfaces, suggestions are much more critical and useful there, so deactivating them is more problematic. Note that, as opposed to Maps history, Google uses a different web history for phones: a typical user has two web histories, one for requests performed from regular computers and another for requests performed from phones. The current exposure is therefore limited to the phone web history, not the “regular” web history.

(4) Web History information can still be leaked from personalized results (see new section 3.3 in our report).

(5) Session hijacking is still possible on some services, although Web History and Bookmarks have been moved to HTTPS.

March 16th, 2010: We informed Bing of our results and sent them our report.

March 18th, 2010: Our report is published on arXiv.

March 23rd, 2010: Google’s Maps suggestions are now sent over HTTPS (but the Maps service is still accessible via HTTP).

April 21st, 2010: MIT Technology Review has an article on our paper.

April 26th, 2010: Personalized Search Suggestions have been reactivated. They are now sent over HTTPS.

May 21st, 2010: Google launches Encrypted Search! (However, it does not seem to work on smartphones yet.)

Related Links