A search engine is a service that builds an index of the World Wide Web and gives users a way to search that index. The most popular search engine is Google, but it's not the only one, and as we'll see, search engines aren't all the same when it comes to data collection.
Now that search engines put an entire Web full of answers at our fingertips, it's tempting to use them to answer all of our burning questions.
Once we type our questions and press "Search", it's up to the search engine what it does with the data.
Depending on which search engine we're using, our queries might be getting logged in a database and stored for all time.
A search query itself isn't typically private information; there are probably many people in the world who want to build a jet ski. However, search engines can log much more than the query: they can attach all sorts of potentially identifying information.
A search query record in a database might look something like this:
| Search query | Date | Time | IP address | User agent |
|---|---|---|---|---|
| how can I build a jet ski? | March 11, 2020 | 11:14 AM | 188.8.131.52 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
If you repeatedly use the same search engine on the same computer and Internet connection (as many of us do, at home), your search queries will all contain the same IP address.
Consider what multiple queries would look like in that database:
| Search query | Date | Time | IP address | User agent |
|---|---|---|---|---|
| "how can I build a jet ski?" | March 11, 2020 | 11:14 AM | 188.8.131.52 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
| "home depot near crescent city" | March 11, 2020 | 4:00 PM | 188.8.131.52 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
| "cheap pizza delivery to 95543" | March 12, 2020 | 9:07 PM | 188.8.131.52 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
| "windsor family tree" | March 13, 2020 | 2:32 PM | 188.8.131.52 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
Search queries suddenly start to look a lot more like personally identifiable information.
Plus, the search history could also include a cookie or even a user ID if you were logged into the search engine website when you issued the query.
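To make the grouping concrete, here is a minimal Python sketch of how logged queries could be collected into per-address profiles. Everything in it is an assumption for illustration: the `QueryRecord` type, the `group_by_ip` helper, the extra "weather tomorrow" query, and the IP addresses (taken from ranges reserved for documentation) are hypothetical and do not describe any real search engine's logging system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """One logged search, with the same columns as the table above."""
    query: str
    date: str
    time: str
    ip_address: str
    user_agent: str


def group_by_ip(records):
    """Collect the queries logged for each IP address, in logged order."""
    profiles = defaultdict(list)
    for record in records:
        profiles[record.ip_address].append(record.query)
    return dict(profiles)


UA = "Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1"

# Hypothetical log. Both IP addresses are stand-ins from documentation-only ranges.
LOG = [
    QueryRecord("how can I build a jet ski?", "March 11, 2020", "11:14 AM", "203.0.113.7", UA),
    QueryRecord("weather tomorrow", "March 11, 2020", "1:30 PM", "198.51.100.23", UA),
    QueryRecord("home depot near crescent city", "March 11, 2020", "4:00 PM", "203.0.113.7", UA),
    QueryRecord("cheap pizza delivery to 95543", "March 12, 2020", "9:07 PM", "203.0.113.7", UA),
    QueryRecord("windsor family tree", "March 13, 2020", "2:32 PM", "203.0.113.7", UA),
]

profiles = group_by_ip(LOG)
for ip_address, queries in profiles.items():
    print(ip_address, queries)
```

No single query here names the searcher, but the list collected under one address hints at a hobby, a town, a ZIP code, and possibly a family name.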
Uses of search history data
By storing both our queries and our identifying information, a search engine can personalize the search results.
For example, consider the search query "Python". If the searcher is a biologist who frequently searches for biology-related terms, the search engine might show them a page about pythons, the snakes, as the first result.
If the searcher is instead a software developer and has many programming-related queries in their search history, the search engine might instead show a page about Python, the programming language.
Programmers who don't like snakes might be very grateful for the personalized results. 🚫🐍
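As a rough illustration of the idea, the toy Python sketch below reorders two candidate results for "Python" according to how many words from a user's past queries match each result's keywords. This is an assumed, deliberately simplistic scoring rule, not how any real search engine ranks pages; the `personalize` function, the keyword sets, and the sample histories are all hypothetical.

```python
def personalize(results, history):
    """Order results by how many words from past queries appear in each result's keywords."""
    history_words = {word for query in history for word in query.lower().split()}

    def score(result):
        return len(history_words & result["keywords"])

    # sorted() is stable, so results with equal scores keep their original order.
    return sorted(results, key=score, reverse=True)


# Hypothetical candidate results for the query "Python".
RESULTS = [
    {"title": "Python (snake)", "keywords": {"snake", "reptile", "species", "habitat"}},
    {"title": "Python (programming language)", "keywords": {"programming", "code", "software", "tutorial"}},
]

# Hypothetical search histories for two different users.
biologist_history = ["reptile habitat loss", "snake species identification"]
developer_history = ["sorting algorithm code", "software testing tutorial"]

biologist_top = personalize(RESULTS, biologist_history)[0]["title"]
developer_top = personalize(RESULTS, developer_history)[0]["title"]
print(biologist_top)
print(developer_top)
```

The same query produces a different first result for each user, and the only input that differs is the stored search history.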
Search engines frequently include advertisements along with search results, since that's how they make enough money to keep operating a search engine for free. Once they start collecting a user's search history, the advertisements can be based on more than just the current query.
In addition to operating a search engine, Google also runs a very popular ad network that places ads on millions of non-Google websites. The Google ads system can use search history to personalize the ads that show up on those websites.
For example, I once spent a day researching smart sensor networks for an article, and I still get served advertisements about smart sensors, even while reading a fashion blog.
🤔 When you see an ad on a site that seems personalized to your interests, do you feel happy that it's catering to you or mad that it knows you so well?
Risks of search history collection
From the search engines' perspective, using your search history personalizes your experience and makes it better.
There are dangers to any form of online data collection, however.
In 2006, the online media company AOL released three months of "anonymized" search data for researchers to use. Their anonymization strategy was to replace the username column in the data with a numeric ID. Each username was always replaced by the same numeric ID, which meant that researchers could group the data by numeric ID and see all the queries a user made during that period. 😬
In less than a week, journalists at the New York Times were able to deduce the identity of user number 4417749 by combing through her queries and piecing together tidbits of personal information.
She was shocked to discover all her search queries were publicly viewable and told the journalist, "My goodness, it's my whole personal life. I had no idea somebody was looking over my shoulder."
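The weakness of that strategy can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not AOL's actual process or data: the usernames, the queries, the sequential numbering, and the `pseudonymize` and `group_by_id` helpers are all hypothetical.

```python
from collections import defaultdict
from itertools import count


def pseudonymize(rows):
    """Replace each username with a numeric ID; a given username always gets the same ID."""
    ids = {}
    next_id = count(1)
    released = []
    for username, query in rows:
        if username not in ids:
            ids[username] = next(next_id)
        released.append((ids[username], query))
    return released


def group_by_id(rows):
    """Rebuild each user's query history from the released data alone."""
    grouped = defaultdict(list)
    for user_id, query in rows:
        grouped[user_id].append(query)
    return dict(grouped)


# Hypothetical raw log of (username, query) pairs.
RAW = [
    ("alice_w", "home depot near crescent city"),
    ("bob99", "python tutorial"),
    ("alice_w", "cheap pizza delivery to 95543"),
    ("alice_w", "windsor family tree"),
]

released = pseudonymize(RAW)
histories = group_by_id(released)
print(histories[1])
```

The usernames are gone from `released`, yet every query from one person still carries the same ID, so the full history can be reassembled and searched for identifying details.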
What's a user to do?
If you're suddenly feeling uncomfortable typing a query into a search engine, that's understandable. But don't worry, you don't need to swear off search engines for the rest of your life.
If you don't like how they're collecting the data but want to continue using the service, you can look for settings that will let you reduce or completely disable data collection. Not every search engine will offer such settings, but many will in order to accommodate privacy-conscious users.
If you're open to using a different service, look for alternative offerings. For example, DuckDuckGo is a privacy-focused search engine that does store the search queries to improve features such as spelling correction but does not store IP addresses, user agents, cookies, or other potentially identifiable information.
🤔 Are you making any changes to your search behavior after learning more? What benefits or drawbacks are you anticipating to your new approach? Share them with us!
Want to join the conversation?
- Does Google collect search history data? If so, is this data publicly viewable?(21 votes)
- Yes, Google automatically collects data on your search history and the websites you visit. As far as I know, the data is not publicly visible, but Google uses it to show personalized ads.(10 votes)
- Advertisers use our data all the time to try to cater advertisements to users. Can researchers use similar data in different ways for science?(13 votes)
- Yes, they could partner with different websites, as in the AOL case mentioned above.(11 votes)
- Why is it legal for big companies like Google to do that? I've heard that they'll sell your information too :( Is that true? It can't be legal!(10 votes)
- Google does not directly sell your information. However, they use it to personalize your experience and to give advertisers a sense of how many people would interact with their ads, based on the information they have about users like you and me. No advertiser is able to access any information you store in your Google account.(8 votes)
- What do you guys think about Open Source browsers? I use Brave sometimes, and I would like to know if OS browsers really are safer than those which aren't.(13 votes)
- Does Google use your information for anything other than keeping you logged into websites?(8 votes)
- Google will also track what websites you visit and what terms you search in order to show personalized advertisements. (This can be stopped in the Google Chrome settings.)(5 votes)
- so can some sites know who you are just by what you search up?(5 votes)
- Not exactly. Although searches are sometimes personal, they are often too vague to identify you. However, if you frequently search for things relating to your location, your school, or even your personality, someone could get a good idea of who you are.(6 votes)
- Could somebody please elaborate on the options for changing search engine settings and what the effects would be? What settings are most private? Which ones are most helpful?(7 votes)
- You should change settings related to cookies, tracking, password saving, site data, permissions, and data collection. Firefox excels at privacy and is open source, if privacy is really what you are looking for.(1 vote)
- i went on water.com u get free water(5 votes)
- Is there any way to prevent google from selling your info?(6 votes)
- Google does not directly sell your information. You can, however, turn off Personalized Ads on your Google Account to stop Google from using your activity to serve you personalized ads.(3 votes)
- is it legal to hack the precedent's devices(3 votes)